In: Paparoditis N., Pierrot-Deseilligny M., Mallet C., Tournaire O. (Eds), IAPRS, Vol. XXXVIII, Part 3A - Saint-Mandé, France, September 1-3, 2010
5 LIMITATIONS
Because it operates on individual pixels, ordinary color based GMM
background subtraction cannot distinguish between foreground and
background when the color difference is small. The depth values
gained from a ToF camera allow a correct classification of all image
blocks whose depth differs from that of the background, as long as
valid depth measurements exist for the background. As illustrated in
Figure 3, however, classification based only on low resolution depth
values results in an unnatural foreground contour due to block
artifacts. In practice, this drawback cannot be resolved in typical
situations, because in such areas the background usually continues
with the same color, so there is no edge that would allow gradient
based methods to smooth the contour of the foreground mask correctly.
If such an edge existed, bilateral filtering (Tomasi and Manduchi,
1998), which is often used in the context of 2D/3D videos to enhance
the depth resolution, would be able to reconstruct the true contour
of the object.
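The block artifacts described above can be reproduced on synthetic data. The following is a minimal sketch, not the implementation used in this paper: the function name `block_depth_mask`, the 8 x 8 block size, the scene geometry, and the depth threshold are all assumptions chosen for illustration. It averages a high resolution depth map over square tiles to imitate a low resolution ToF sensor, classifies each tile by its mean depth, and compares the upsampled decision with the true object mask.

```python
import numpy as np

def block_depth_mask(depth_hi, background_depth, block, threshold):
    """Classify each block x block tile as foreground from its mean depth,
    then repeat the decision over the whole tile (nearest-neighbour upsampling)."""
    h, w = depth_hi.shape
    # Imitate the low resolution depth sensor: one depth value per tile.
    depth_lo = depth_hi.reshape(h // block, block, w // block, block).mean(axis=(1, 3))
    fg_lo = np.abs(depth_lo - background_depth) > threshold
    tile = np.ones((block, block), dtype=np.uint8)
    return np.kron(fg_lo.astype(np.uint8), tile).astype(bool)

# Synthetic scene (assumed values): a disk at 1.5 m in front of a wall at 3.0 m.
size, block = 64, 8
yy, xx = np.mgrid[0:size, 0:size]
true_mask = (yy - 32) ** 2 + (xx - 32) ** 2 < 20 ** 2
depth_hi = np.where(true_mask, 1.5, 3.0)

mask = block_depth_mask(depth_hi, background_depth=3.0, block=block, threshold=0.3)

# Pixels where the blocky mask disagrees with the true object contour.
errors = mask != true_mask
print(int(errors.sum()), "pixels differ from the true contour")
```

In this sketch all disagreements fall inside tiles that straddle the object boundary, which is where a depth-only decision has no information finer than the tile size.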
To resolve the general case, contour estimation methods that
incorporate knowledge of the object, given a priori or learned over
time, are necessary, but it does not seem possible to achieve good
results in a setting that is not strictly defined. Only in the
opposite case, when inappropriate depth measurements result in a
wrong classification, can gradient based methods be applied to smooth
the contour.
[Figure 3 region labels: "Cannot be resolved", "Possibly resolvable"]
Figure 3: Illustration of a foreground mask. Dark yellow: foreground
object, light yellow: background with similar color, red: detected
object contour
6 CONCLUSION
In this paper, the standard method for background subtraction
based on Gaussian Mixture Models is adapted to operate on videos
acquired with a 2D/3D camera. The proposed method was compared
to standard and previous methods using simple 2D/3D video
sequences. Qualitative as well as quantitative results were
presented, and it was found that the proposed method is able to
compensate for misclassification of pixels due to color similarities
between foreground objects and the background by utilizing depth
and modulation amplitude information without harming the high
resolution contour of foreground objects. Furthermore, this method
provides a clearly improved treatment of shadows and noise
compared to previous methods.
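The role of depth and modulation amplitude summarized above can be illustrated with a deliberately simplified per-pixel test. This is a hedged sketch and not the proposed method: it uses a single Gaussian per pixel instead of a mixture, grayscale values instead of color, and it assumes that a depth measurement counts as valid when the modulation amplitude exceeds a fixed threshold. The function name `is_background`, the dictionary keys, and the parameters `k` and `min_amplitude` are hypothetical.

```python
import numpy as np

def is_background(color, depth, amplitude, model, k=2.5, min_amplitude=0.1):
    """Per-pixel background test on intensity plus depth (single Gaussian each).

    All arrays share one shape; `model` holds per-pixel means and standard
    deviations. Depth only takes part where the modulation amplitude is high
    enough for the measurement to be trusted (assumed validity rule).
    """
    color_ok = np.abs(color - model["color_mean"]) <= k * model["color_std"]
    depth_ok = np.abs(depth - model["depth_mean"]) <= k * model["depth_std"]
    depth_valid = amplitude >= min_amplitude
    # Unreliable depth is ignored, so the decision falls back to color alone.
    return color_ok & (depth_ok | ~depth_valid)

# Three example pixels (assumed values) against the same background model.
model = {
    "color_mean": np.full((1, 3), 100.0),
    "color_std": np.full((1, 3), 5.0),
    "depth_mean": np.full((1, 3), 3.0),
    "depth_std": np.full((1, 3), 0.05),
}
color = np.array([[101.0, 101.0, 160.0]])  # first two pixels match the background color
depth = np.array([[3.0, 1.5, 3.0]])        # second pixel is much closer than the background
amplitude = np.array([[0.8, 0.8, 0.8]])    # all depth measurements treated as valid

background = is_background(color, depth, amplitude, model)
```

Here the second pixel matches the background in color and is separated from it only by its depth value; if its amplitude were below `min_amplitude`, the test would fall back to color and label it background.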
The additional burden of processing and maintaining the depth
values, compared with standard GMM based background subtraction
methods, is small; real-time processing is easily possible on
current PCs.
ACKNOWLEDGEMENTS
This work was funded by the German Research Foundation (DFG)
as part of the research training group GRK 1564 ’Imaging New
Modalities’. The authors would like to thank Omar E. Loepprich
for his help with recording the 2D/3D videos and for the valuable
discussions.
REFERENCES
Bartczak, B. and Koch, R., 2009. Dense depth maps from low
resolution time-of-flight depth and high resolution color views.
In: Proc. of ECCV Workshop on Multi-camera and Multi-modal
Sensor Fusion Algorithms and Applications, Lecture Notes in
Computer Science, Vol. 5876, pp. 228-239.
Bianchi, L., Dondi, P., Gatti, R., Lombardi, L. and Lombardi, P.,
2009. Evaluation of a foreground segmentation algorithm for 3d
camera sensors. In: ICIAP, Lecture Notes in Computer Science,
Vol. 5716, Springer, pp. 797-806.
Chan, D., Buisman, H., Theobalt, C. and Thrun, S., 2008. A
noise-aware filter for real-time depth upsampling. In: Proc. of
ECCV Workshop on Multi-camera and Multi-modal Sensor Fusion
Algorithms and Applications.
Crabb, R., Tracey, C., Puranik, A. and Davis, J., 2008. Real-time
foreground segmentation via range and color imaging. In: Computer
Vision and Pattern Recognition Workshops, 2008. CVPRW '08,
pp. 1-5.
Ghobadi, S. E., Loepprich, O. E., Ahmadov, F., Bernshausen, J.,
Hartmann, K. and Loffeld, O., 2008. Real time hand based robot
control using 2d/3d images. In: ISVC '08: Proceedings of the
4th International Symposium on Advances in Visual Computing,
Part II, Springer-Verlag, Berlin, Heidelberg, pp. 307-316.
Harville, M., Gordon, G. and Woodfill, J., 2001. Foreground
segmentation using adaptive mixture models in color and depth. In:
Proceedings of the IEEE Workshop on Detection and Recognition
of Events in Video, IEEE Computer Society, Los Alamitos,
CA, USA, pp. 3-11.
Leens, J., Pierard, S., Barnich, O., Droogenbroeck, M. V. and
Wagner, J.-M., 2009. Combining color, depth, and motion for
video segmentation. In: ICVS ’09: Proceedings of the 7th
International Conference on Computer Vision Systems,
Springer-Verlag, pp. 104-113.
Lindner, M., Lambers, M. and Kolb, A., 2008. Sub-pixel data
fusion and edge-enhanced distance refinement for 2d/3d images.
Int. J. Intell. Syst. Technol. Appl. 5(3/4), pp. 344-354.
Prasad, T., Hartmann, K., Wolfgang, W., Ghobadi, S. and Sluiter,
A., 2006. First steps in enhancing 3d vision technique using
2d/3d sensors. In: 11. Computer Vision Winter Workshop 2006,
Czech Society for Cybernetics and Informatics, University of
Siegen, pp. 82-86.
Rajagopalan, A. N., Bhavsar, A., Wallhoff, F. and Rigoll, G.,
2008. Resolution enhancement of pmd range maps. In:
Proceedings of the 30th DAGM symposium on Pattern Recognition,
Springer-Verlag, pp. 304-313.
Schuon, S., Theobalt, C., Davis, J. and Thrun, S., 2008. High-
quality scanning using time-of-flight depth superresolution. In:
Computer Vision and Pattern Recognition Workshops, 2008.
CVPRW ’08. IEEE Computer Society Conference on, pp. 1-7.