The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences. Vol. XXXVII. Part B3b. Beijing 2008
Figure 6: 4. Step: (a) Result before editing, (b) Result after editing, (c) Prototypes.
prototypes from the initialisation and used them to find at least
one new instance in the facade image shown in Fig. 4. Then we
recursively searched for all probable new instances and classified
them according to the learned models. Instances that could not
be clearly matched to one class were presented to the user. In this
way a new class 1-3, marked black, was established. The result
is shown in Fig. 4(a). Naturally, the classifiers were not very
robust at this stage, as they were learned from only a few examples.
Hence there were some misclassifications, which were manually
corrected by the user. The result after editing is shown in
Fig. 4(c). The image patches shown were used to update the class
hierarchy, that is, to update already known prototypes, to initialise
new classes, and to update the subspace representations. As the
dimension of the feature subspace increases with every new class,
we omit the presentation of features in subspace as in Fig. 3(d).
Note that we started with only one example of a certain size and
height-to-width ratio. Thus we did not recognise the bigger windows.
This would be fixed by defining a new example of a different size.
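The rule of presenting instances that cannot be clearly matched to one class to the user can be illustrated with a minimal sketch. It assumes that each class is summarised by a single prototype vector and that "clearly matched" means the nearest prototype is substantially closer than the second nearest. The Euclidean distance, the function name `classify_or_defer` and the parameter `ratio` are illustrative assumptions and are not taken from the method described here.

```python
import math


def classify_or_defer(feature, prototypes, ratio=0.8):
    """Return (label, None) for a clear match, or (None, ranked_labels)
    when the instance should be presented to the user.

    `prototypes` maps a class label to a prototype feature vector.
    `ratio` is a hypothetical ambiguity parameter: the best distance must
    be at most `ratio` times the second-best distance to count as clear.
    """
    # Rank all classes by distance between the feature and their prototype.
    ranked = sorted((math.dist(feature, p), label)
                    for label, p in prototypes.items())
    if len(ranked) == 1:
        return ranked[0][1], None
    (d1, best), (d2, _) = ranked[0], ranked[1]
    if d1 <= ratio * d2:
        return best, None
    # Ambiguous: defer to the user, offering the classes in ranked order.
    return None, [label for _, label in ranked]


prototypes = {"1-1": (0.0, 0.0), "1-2": (10.0, 0.0)}
clear = classify_or_defer((1.0, 0.0), prototypes)
ambiguous = classify_or_defer((4.9, 0.0), prototypes)
```

In this toy setting the first call yields a clear match to class 1-1, while the second instance lies almost midway between both prototypes and is deferred to the user.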
Figs. 5 and 6 show the results of steps 3 and 4 in the same way. Step 3
established a new subclass 1-4. A sample of this class was detected
and correctly classified in the next image. As instances of class 1-1
occur in every image (30 instances were found within the first three
images), the classifier became more robust. Hence the classification
results for the image of Fig. 6 are clearly better than for the
first ones.
6 CONCLUSIONS AND FUTURE WORK
We presented a concept for an incremental learning scheme. Given one
example of an object within a rectified image and given the prior
knowledge that the object appears several times in the image, we
learn the variation in appearance of the class of the given object
in order to detect further instances in other images. We have shown a
recursive procedure to find similar objects within an image. Using
unsupervised clustering we are able to hypothesise different object
classes within an image. Finally, with minor help from the user, we
identify new object classes or new subclasses among the found
clusters. In this way, we build up an object hierarchy of classes and
subclasses with minimal user interaction that is updated with every
new image.
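One possible reading of the recursive search for similar objects is sketched below. It assumes that every candidate image patch is described by a feature vector and that a candidate is accepted when it lies within a distance threshold of any instance found so far, so that each newly found instance in turn serves as a query. The function name `recursive_search`, the Euclidean distance and the parameter `t1` are assumptions made for illustration. The recursion is written as a loop over a frontier of new instances.

```python
import math


def recursive_search(seed, candidates, t1):
    """Collect all candidates reachable from `seed` through chains of
    pairwise distances below `t1` (a hypothetical search threshold)."""
    found = [seed]
    frontier = [seed]            # instances whose neighbours are unexplored
    remaining = list(candidates)
    while frontier:
        current = frontier.pop()
        still_remaining = []
        for c in remaining:
            if math.dist(current, c) < t1:
                found.append(c)      # accept as a new instance
                frontier.append(c)   # and use it as a query later
            else:
                still_remaining.append(c)
        remaining = still_remaining
    return found


instances = recursive_search((0.0, 0.0),
                             [(1.0, 0.0), (2.0, 0.0), (10.0, 0.0)],
                             t1=1.5)
```

Here the patch at (2.0, 0.0) is too far from the seed but is reached via (1.0, 0.0), whereas (10.0, 0.0) is never accepted. The example also shows how strongly the outcome depends on the chosen threshold.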
The results of the recursive search and the clustering procedure
currently depend too much on the choice of the thresholds T1
and T2. We will either use an optimisation procedure to find the best
thresholds T1 and T2, e.g. a hierarchical clustering procedure
with a dendrogram to find an optimal threshold for T2, or, as an
alternative, use a more sophisticated clustering procedure that
contains an optimisation function, thus avoiding the need to set
thresholds.
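As a minimal sketch of how a dendrogram could suggest a clustering threshold, the following code computes the merge heights of a single-linkage hierarchical clustering and places the cut in the middle of the largest jump between successive heights. Single linkage, the Euclidean distance and the largest-gap heuristic are assumptions chosen for illustration. The text does not specify which linkage or selection criterion would be used, and the resulting value only stands in for a threshold such as T2.

```python
import math


def single_linkage_heights(points):
    """Merge heights of a single-linkage dendrogram, in merge order."""
    clusters = [[p] for p in points]
    heights = []
    while len(clusters) > 1:
        best = None
        # Find the pair of clusters with the smallest minimum distance.
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(math.dist(a, b)
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        heights.append(d)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return heights


def threshold_from_largest_gap(heights):
    """Cut midway across the largest jump between successive merge heights."""
    if len(heights) < 2:
        raise ValueError("need at least two merges")
    hs = sorted(heights)
    _, k = max((hs[i + 1] - hs[i], i) for i in range(len(hs) - 1))
    return 0.5 * (hs[k] + hs[k + 1])


points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (11.0, 10.0)]
heights = single_linkage_heights(points)
cut = threshold_from_largest_gap(heights)
```

For the two well-separated groups in this toy data the cut falls between the small within-group merge heights and the single large between-group merge height. This brute-force version is only meant for small numbers of instances.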
To increase the classification performance we will adapt the current
subspace methods to an incremental LDA, cf. Uray et al. (2007).
Moreover, the feature vector can be expanded by further features,
e.g. depth information obtained from surface reconstruction given
prior knowledge of symmetry, which we can assume for objects like
windows and balconies, cf. (Hong et al., 2004), (Yang et al., 2005),
and which can then be used to increase classification performance.
ACKNOWLEDGEMENTS
This work is funded by the EU project 027113 eTRIMS, E-Training
for Interpreting Images of Man-Made Scenes.
References
Belhumeur, P. N., Hespanha, J. and Kriegman, D. J., 1997. Eigenfaces vs.
fisherfaces: Recognition using class specific linear projection. PAMI
19(7), pp. 711-720.
Buhmann, J. M. and Hofmann, T., 1994. A maximum entropy approach
to pairwise data clustering. In: ICPR, pp. 207-212.
Fei-Fei, L., Fergus, R. and Perona, P., 2004. Learning generative
visual models from few training examples: an incremental Bayesian
approach tested on 101 object categories. In: CVPR Workshop on
Generative-Model Based Vision.
Fidler, S., 2006. Combining Reconstructive and Discriminative Subspace
Methods for Robust Classification and Regression by Subsampling.
PAMI 28(3), pp. 337-350.
Hofmann, T. and Buhmann, J. M., 1997. Pairwise data clustering by
deterministic annealing. PAMI 19, pp. 1-14.
Hong, W., Yang, A. Y., Huang, K. and Ma, Y., 2004. On symmetry and
multiple-view geometry: Structure, pose, and calibration from a single
image. IJCV 60(3), pp. 241-265.
Li, L.-J., Wang, G. and Fei-Fei, L., 2007. OPTIMOL: automatic Object
Picture collecTion via Incremental MOdel Learning. In: CVPR.
Sivic, J. and Zisserman, A., 2006. Video Google: Efficient visual search
of videos. In: Toward Category-Level Object Recognition, LNCS, Vol.
4170, Springer, pp. 127-144.
Skocaj, D., Uray, M., Leonardis, A. and Bischof, H., 2006. Why to
combine reconstructive and discriminative information for incremental
subspace learning. In: CVWW 2006, Telč, Czech Republic.
Uray, M., Skocaj, D., Roth, P. M., Bischof, H. and Leonardis, A., 2007.
Incremental LDA Learning by Combining Reconstructive and
Discriminative Approaches. In: BMVC.
Van Gool, L., Zeng, G., Van den Borre, F. and Müller, P., 2007. Towards
mass-produced building models. In: PIA, Vol. 36, Munich, Germany,
pp. 209-220.
Yang, A. Y., Huang, K., Rao, S., Hong, W. and Ma, Y., 2005.
Symmetry-based 3-D reconstruction from perspective images. CVIU
99(2), pp. 210-240.