
5.4 Minimization 
The main problem of non-linear parameter estimation is to find a method that guarantees convergence of the cost function (eq. 20) to the global minimum. Minimization with the Levenberg-Marquardt method (see [20]), a combination of Newton's method and gradient descent, converges to the nearest local minimum; the global minimum is only found if good initial parameter values are available. However, we do not have initial parameter estimates. We therefore divide the global model fitting problem into three steps in order to enhance and monitor the parameter estimates.
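To illustrate the basic fitting step, the following Python sketch minimizes a toy reprojection cost with the Levenberg-Marquardt method. The cost function of eq. 20 is not reproduced here; the residual function, the parameterization, and the use of SciPy are illustrative assumptions and not the implementation used in the system.

import numpy as np
from scipy.optimize import least_squares

# Toy stand-in for the real cost function (eq. 20): project known 3D model
# points with a simple pinhole camera parameterized by focal length f and a
# translation (tx, ty, tz), and compare them with observed image points.
model_points = np.array([[0.00, 0.00, 0.00],
                         [0.10, 0.00, 0.00],
                         [0.00, 0.10, 0.00],
                         [0.10, 0.10, 0.05]])

def project(params, pts):
    f, tx, ty, tz = params
    cam = pts + np.array([tx, ty, tz])       # translate into the camera frame
    return f * cam[:, :2] / cam[:, 2:3]      # pinhole projection

true_params = np.array([0.05, 0.02, -0.01, 1.0])   # 50 mm lens, 1 m distance
rng = np.random.default_rng(0)
observed = project(true_params, model_points) + rng.normal(scale=1e-6, size=(4, 2))

def residuals(params):
    return (project(params, model_points) - observed).ravel()

# Levenberg-Marquardt converges to the nearest local minimum, so the result
# depends strongly on the initial guess x0.
x0 = np.array([0.03, 0.0, 0.0, 0.8])
result = least_squares(residuals, x0, method="lm")
print(result.x)    # approximates true_params when x0 is reasonable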
Step I: In the first step, the poses of all objects are reconstructed individually and separately for each camera view. This procedural knowledge belongs to the concept RC_OBJECT and is inherited by every specialization. The projection of one object model depends on seven parameters. Since only a few parameters have to be estimated, the individual reconstructions are performed very quickly; however, the minimizations have to be monitored so that they do not converge to false local minima caused by inappropriate initial values. If the focal length leaves an admissible range (10-100 mm in our case), the object is rotated by negating two rotational parameters and the minimization is restarted with the other parameters reset to their original initial values. The cost function is also monitored during minimization: if the process converges to a local minimum with inadmissibly high cost, the z-translation parameter is modified according to a predefined scheme. This monitored Levenberg-Marquardt iteration is stopped either when the change of the parameter estimates from one iteration step to the next is smaller than a given threshold, or when the model fitting does not succeed, i.e. when a maximum number of iterations is reached or the same local minimum is found despite modified parameter values.
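The following sketch shows how such a monitored minimization could be wrapped around a Levenberg-Marquardt solver. The parameter layout, the cost threshold, the z-translation schedule, and the restart limit are assumptions made for the example and are not taken from the paper.

import numpy as np
from scipy.optimize import least_squares

# Hypothetical indices into the per-object parameter vector
# [rot_x, rot_y, rot_z, t_x, t_y, t_z, f]; the real layout is not given.
ROT_X, ROT_Y, T_Z, FOCAL = 0, 1, 5, 6
FOCAL_RANGE = (0.010, 0.100)    # admissible focal length: 10-100 mm
Z_SCHEDULE = (1.5, 0.5, 2.0)    # illustrative z-translation modifiers
COST_LIMIT = 1e-3               # illustrative "admissible" cost threshold
MAX_RESTARTS = 5

def monitored_fit(residuals, x0):
    """Restart the Levenberg-Marquardt fit when it runs into an
    inadmissible focal length or an obviously wrong local minimum."""
    x_init = np.asarray(x0, dtype=float)
    x_try = x_init.copy()
    best = None
    for attempt in range(MAX_RESTARTS):
        res = least_squares(residuals, x_try, method="lm")
        cost_ok = res.cost < COST_LIMIT
        focal_ok = FOCAL_RANGE[0] <= res.x[FOCAL] <= FOCAL_RANGE[1]
        if cost_ok and focal_ok:
            return res                      # successful reconstruction
        if best is None or res.cost < best.cost:
            best = res
        x_try = x_init.copy()               # reset the other parameters
        if not focal_ok:
            # "rotate the object" by negating two rotational parameters
            x_try[ROT_X] = -x_try[ROT_X]
            x_try[ROT_Y] = -x_try[ROT_Y]
        else:
            # modify the z-translation according to a predefined scheme
            x_try[T_Z] *= Z_SCHEDULE[attempt % len(Z_SCHEDULE)]
    return best                             # model fitting did not succeed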
Step II: If a successful instance of a reconstructed object is created, it is added as a part of RC_VIEW. This concept performs step II of the minimization process: for a given camera view, the median of all focal length estimates from step I is fixed and used to reconstruct the pose of each object in the scene. In this way, better initial estimates of the objects' poses are derived for each view of the scene.
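A possible realization of step II is sketched below; the parameter index, the function names, and the use of SciPy are assumptions made for this illustration.

import numpy as np
from scipy.optimize import least_squares

FOCAL = 6   # hypothetical index of the focal length in the parameter vector

def refit_view(step1_results, residuals_for_object):
    """Step II sketch: fix the median focal length of one camera view and
    re-estimate only the six pose parameters of every object in that view."""
    f_median = np.median([res.x[FOCAL] for res in step1_results])
    refined = []
    for res, residuals in zip(step1_results, residuals_for_object):
        pose0 = np.delete(res.x, FOCAL)          # pose parameters from step I
        def pose_residuals(pose, f=f_median, r=residuals):
            return r(np.insert(pose, FOCAL, f))  # focal length held fixed
        refined.append(least_squares(pose_residuals, pose0, method="lm"))
    return f_median, refined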
Step III: The median focal length and the resulting object poses of step II are used as initial values for the global model fitting. From the object correspondences it is possible to estimate the relative pose between the different cameras. This step is part of the procedural knowledge of the concept RC_SCENE; within this step the concept RC_CAM_PARAM can be instantiated.
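The joint parameter vector of step III might be organized as sketched below. The layout (relative camera pose followed by six pose parameters per object) and the function project_view are assumptions made for this illustration, not the paper's data structures.

import numpy as np
from scipy.optimize import least_squares

def global_residuals(params, n_objects, project_view):
    """Step III sketch: one joint parameter vector holding the relative
    camera pose (6 values) followed by six pose parameters per object.
    project_view(view, poses, rel_pose) returns the residuals of one view."""
    rel_pose = params[:6]
    poses = params[6:].reshape(n_objects, 6)
    return np.concatenate([project_view(view, poses, rel_pose)
                           for view in (0, 1)])    # stereo: two views

# x0 would stack an initial relative pose with the step-II object poses:
# x0 = np.concatenate([rel_pose0, poses_step2.ravel()])
# result = least_squares(global_residuals, x0, method="lm",
#                        args=(n_objects, project_view))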
5.5 Camera Parameter Estimation 
Classical camera calibration methods (e.g. [28]) cannot be performed on-line, as they require a special calibration pattern. Depth estimation then becomes a two-step process, which may lead to suboptimal solutions. We have explicitly modeled the camera parameters in our projection functions; they are therefore estimated using the knowledge of the 3D structure of the objects in the scene, as part of the procedural knowledge of the concepts RC_SCENE and RC_CAM_PARAM. We estimate the external camera parameters and the focal length. The results show that the principal point and the scale factors of our off-the-shelf CCD cameras are stable enough to be assumed fixed. The influence of lens distortion on the results of our approach is quite small; nevertheless, it would be possible to model the estimation of lens distortion in a manner similar to that of [10].
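A projection function with explicitly modeled camera parameters could be sketched as follows; the Euler-angle convention, the fixed intrinsic values, and the pixel scaling are placeholder assumptions, not the model used in the paper.

import numpy as np

# Fixed intrinsic values assumed for an off-the-shelf CCD camera; the
# concrete numbers are placeholders, not values from the paper.
PRINCIPAL_POINT = np.array([384.0, 288.0])   # pixels
SCALE = np.array([80000.0, 80000.0])         # pixels per metre on the sensor

def rotation(rx, ry, rz):
    """Rotation matrix from three Euler angles (x-y-z convention assumed)."""
    cx, sx = np.cos(rx), np.sin(rx)
    cy, sy = np.cos(ry), np.sin(ry)
    cz, sz = np.cos(rz), np.sin(rz)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

def project(points_3d, ext_params, f):
    """Project 3D object points with explicitly modeled external parameters
    (3 rotations, 3 translations) and focal length f; the principal point
    and the scale factors are kept fixed."""
    rx, ry, rz, tx, ty, tz = ext_params
    cam = points_3d @ rotation(rx, ry, rz).T + np.array([tx, ty, tz])
    img = f * cam[:, :2] / cam[:, 2:3]        # pinhole projection (metres)
    return img * SCALE + PRINCIPAL_POINT      # conversion to pixel coordinates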
Tsai [28] shows that a full camera calibration is possible with five coplanar reference points. A calibration solution derived from four coplanar points is unique, because four coplanar points determine a collineation in the plane, and further virtual points in that plane can be constructed as intersections of lines through the four points. Six non-coplanar points determine a unique solution as well (see [30]).
Scene reconstruction is possible with a single camera view, but taking a stereo image leads to much more robust results. Furthermore, the pose of a circle with known radius cannot be computed uniquely from one view (see [11]). With at least two images for reconstruction and known focal lengths, the pose of a circle in space is uniquely defined up to the direction of its normal vector (cf. [4]); the sign of the normal can be determined from the visibility of the projected ellipse.
5.6 Results 
Fig. 8 shows the object recognition results and the 3D reconstruction of a stereo image typical of our scenario. Figs. 8 a) and b) visualize the instances of the corresponding specializations of the concept PE_OBJECT (names in German) and their image regions obtained by the color segmentation. All objects are recognized correctly; only in the right image the small ring is missing. This is corrected by using the left image in the 3D reconstruction process. Fig. 8 c) shows the final result of the 3D scene reconstruction (an instance of RC_SCENE): the geometric object models are projected onto the right image and fit the objects in the images very well.
6 Conclusion 
Based on a detailed discussion of object modeling 
for object recognition and scene interpretation, a

Fig. 8: Object recognition results (a, b) and 3D scene reconstruction (c) of a stereo image.
