
Scale-invariant vote-based 3D recognition and registration from point clouds. (English) Zbl 1251.68269

Cipolla, Roberto (ed.) et al., Machine learning for computer vision. Selected papers based on the presentations at the international computer vision summer school (ICVSS 2012), Sicily, Italy, July 7–15, 2012. Berlin: Springer (ISBN 978-3-642-28660-5/hbk; 978-3-642-28661-2/ebook). Studies in Computational Intelligence 411, 137-162 (2013).
Summary: This chapter presents a method for vote-based 3D shape recognition and registration that, for the first time, applies mean shift to 3D pose votes in the space of direct similarity transformations. We introduce a new distance between poses in this space, the SRT distance. Unlike Euclidean distance, it is left-invariant; unlike Riemannian distance, it has a unique, closed-form mean, so it is fast to compute. We demonstrate improved performance over the state of the art in both recognition and registration on a challenging real dataset, comparing our distance with others in a mean shift framework as well as with the commonly used Hough voting approach.
For the entire collection see [Zbl 1248.68030].

MSC:

68T45 Machine vision and scene understanding
68U05 Computer graphics; computational geometry (digital and algorithmic aspects)

References:

[1] Toshiba CAD model point clouds dataset
[2] Agrawal, M.: A Lie algebraic approach for consistent pose registration for general Euclidean motion. In: Proc. Int. Conf. on Intelligent Robots and Systems, pp. 1891-1897 (2006)
[3] Arsigny, V.; Commowick, O.; Pennec, X.; Ayache, N.; Pluim, J. P.W.; Likar, B.; Gerritsen, F. A., A Log-Euclidean Polyaffine Framework for Locally Rigid or Affine Registration, Biomedical Image Registration, 120-127 (2006), Heidelberg: Springer, Heidelberg · doi:10.1007/11784012_15
[4] Ballard, D. H., Generalizing the Hough transform to detect arbitrary shapes, Pattern Recognition, 13, 2, 111-122 (1981) · Zbl 0454.68112 · doi:10.1016/0031-3203(81)90009-1
[5] Besl, P., McKay, N.: A method for registration of 3D shapes. IEEE Trans. on Pattern Analysis and Machine Intelligence 14(2) (1992)
[6] Campbell, R. J.; Flynn, P. J., A survey of free-form object representation and recognition techniques, Computer Vision and Image Understanding, 81, 166-210 (2001) · Zbl 1011.68544 · doi:10.1006/cviu.2000.0889
[7] Cetingul, H.E., Vidal, R.: Intrinsic mean shift for clustering on Stiefel and Grassmann manifolds. In: Proc. IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1896-1902 (2009)
[8] Chen, H.; Bhanu, B., 3D free-form object recognition in range images using local surface patches, Pattern Recognition Letters, 28, 1252-1262 (2007) · doi:10.1016/j.patrec.2007.02.009
[9] Cheng, Y., Mean shift, mode seeking, and clustering, IEEE Trans. on Pattern Analysis and Machine Intelligence, 17, 790-799 (1995) · doi:10.1109/34.400568
[10] Davies, P. I.; Higham, N. J., A Schur-Parlett algorithm for computing matrix functions, SIAM J. Matrix Anal. Appl., 25, 464-485 (2003) · Zbl 1052.65031 · doi:10.1137/S0895479802410815
[11] Drost, B., Ulrich, M., Navab, N., Ilic, S.: Model globally, match locally: Efficient and robust 3D object recognition. In: Proc. IEEE Conf. on Computer Vision and Pattern Recognition, pp. 998-1005 (2010)
[12] Eggert, D. W.; Lorusso, A.; Fisher, R. B., Estimating 3-D rigid body transformations: a comparison of four major algorithms, Machine Vision and Applications, 9, 272-290 (1997) · doi:10.1007/s001380050048
[13] Ashbrook, A. P.; Fisher, R. B.; Robertson, C.; Werghi, N.; Burkhardt, H.; Neumann, B., Finding Surface Correspondence for Object Recognition and Registration Using Pairwise Geometric Histograms, Computer Vision - ECCV’98, 674 (1998), Heidelberg: Springer, Heidelberg · doi:10.1007/BFb0054772
[14] Fréchet, M., Les éléments aléatoires de nature quelconque dans un espace distancié, Ann. Inst. H. Poincaré, 10, 215-310 (1948) · Zbl 0035.20802
[15] Frome, A.; Huber, D.; Kolluri, R.; Bülow, T.; Malik, J.; Pajdla, T.; Matas, J. (G.), Recognizing Objects in Range Data Using Regional Point Descriptors, Computer Vision - ECCV 2004, 224-237 (2004), Heidelberg: Springer, Heidelberg · Zbl 1098.68766 · doi:10.1007/978-3-540-24672-5_18
[16] Gall, J., Lempitsky, V.: Class-specific Hough forests for object detection. In: Proc. IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1022-1029 (June 2009)
[17] Johnson, A. E.; Hebert, M., Using spin images for efficient object recognition in cluttered 3d scenes, IEEE Trans. on Pattern Analysis and Machine Intelligence, 21, 5, 433-449 (1999) · doi:10.1109/34.765655
[18] Kazhdan, M., Funkhouser, T., Rusinkiewicz, S.: Rotation invariant spherical harmonic representation of 3d shape descriptors. In: Proc. Eurographics/ACM SIGGRAPH Symp. on Geometry Processing, pp. 156-164 (2003)
[19] Khoshelham, K.: Extending generalized Hough transform to detect 3D objects in laser range data. In: Workshop on Laser Scanning, vol. XXXVI, pp. 206-210 (2007)
[20] Knopp, J.; Prasad, M.; Willems, G.; Timofte, R.; Van Gool, L.; Daniilidis, K.; Maragos, P.; Paragios, N., Hough Transform and 3D SURF for Robust Three Dimensional Classification, Computer Vision - ECCV 2010, 589-602 (2010), Heidelberg: Springer, Heidelberg · doi:10.1007/978-3-642-15567-3_43
[21] Leibe, B.; Leonardis, A.; Schiele, B., Robust object detection with interleaved categorization and segmentation, Int. J. Computer Vision, 77, 1-3, 259-289 (2008) · doi:10.1007/s11263-007-0095-3
[22] Mamic, G.; Bennamoun, M., Representation and recognition of 3d free-form objects, Digital Signal Processing, 12, 1, 47-76 (2002) · doi:10.1006/dspr.2001.0412
[23] Mian, A. S.; Bennamoun, M.; Owens, R. A., Automatic correspondence for 3D modeling: an extensive review, Int. J. Shape Modeling, 11, 2, 253-291 (2005) · Zbl 1092.68693 · doi:10.1142/S0218654305000797
[24] Mian, A. S.; Bennamoun, M.; Owens, R., Three-dimensional model-based object recognition and segmentation in cluttered scenes, IEEE Trans. on Pattern Analysis and Machine Intelligence, 28, 10, 1584-1601 (2006) · doi:10.1109/TPAMI.2006.213
[25] Moakher, M., Means and averaging in the group of rotations, SIAM J. Matrix Anal. Appl., 24, 1-16 (2002) · Zbl 1028.47014 · doi:10.1137/S0895479801383877
[26] Mundy, J. L.; Ponce, J.; Hebert, M.; Schmid, C.; Zisserman, A., Object Recognition in the Geometric Era: A Retrospective, Toward Category-Level Object Recognition, 3-28 (2006), Heidelberg: Springer, Heidelberg · doi:10.1007/11957959_1
[27] Okada, R.: Discriminative generalized Hough transform for object detection. In: Proc. Int. Conf. on Computer Vision, pp. 2000-2005 (October 2009)
[28] Opelt, A., Pinz, A., Zisserman, A.: Learning an alphabet of shape and appearance for multi-class object detection. Int. J. Computer Vision 80(1) (2008)
[29] Osada, R.; Funkhouser, T.; Chazelle, B.; Dobkin, D., Shape distributions, ACM Trans. Graph., 21, 807-832 (2002) · Zbl 1331.68256 · doi:10.1145/571647.571648
[30] Pelletier, B., Kernel density estimation on Riemannian manifolds, Statistics & Probability Letters, 73, 3, 297-304 (2005) · Zbl 1065.62063 · doi:10.1016/j.spl.2005.04.004
[31] Pennec, X., Intrinsic statistics on Riemannian manifolds: Basic tools for geometric measurements, J. Math. Imaging Vis., 25, 1, 127-154 (2006) · Zbl 1478.94072 · doi:10.1007/s10851-006-6228-4
[32] Pennec, X.; Ayache, N., Uniform distribution, distance and expectation problems for geometric features processing, J. Math. Imaging Vis., 9, 49-67 (1998) · Zbl 0906.68198 · doi:10.1023/A:1008270110193
[33] Petrelli, A., Di Stefano, L.: On the repeatability of the local reference frame for partial shape matching. In: Proc. Int. Conf. on Computer Vision (2011)
[34] Rusu, R.B., Blodow, N., Beetz, M.: Fast point feature histograms (FPFH) for 3D registration. In: Proc. Int. Conf. Robotics and Automation, pp. 3212-3217 (2009)
[35] Saupe, D.; Vranic, D. V.; Radig, B.; Florczyk, S., 3D Model Retrieval with Spherical Harmonics and Moments, Pattern Recognition, 392 (2001), Heidelberg: Springer, Heidelberg · Zbl 1038.68854 · doi:10.1007/3-540-45404-7_52
[36] Schramm, É., Schreck, P.: Solving geometric constraints invariant modulo the similarity group. In: Int. Conf. on Computational Science and Applications, pp. 356-365 (2003)
[37] Shotton, J. D.J.; Blake, A.; Cipolla, R., Multiscale categorical object recognition using contour fragments, IEEE Trans. on Pattern Analysis and Machine Intelligence, 30, 7, 1270-1281 (2008) · doi:10.1109/TPAMI.2007.70772
[38] Srivastava, A.; Klassen, E., Monte Carlo extrinsic estimators of manifold-valued parameters, IEEE Trans. on Signal Processing, 50, 2, 299-308 (2002) · doi:10.1109/78.978385
[39] Subbarao, R., Meer, P.: Nonlinear mean shift for clustering over analytic manifolds. In: Proc. IEEE Conf. on Computer Vision and Pattern Recognition, vol. I, pp. 1168-1175 (2006)
[40] Subbarao, R., Meer, P.: Nonlinear mean shift over Riemannian manifolds. Int. J. Computer Vision 84(1) (2009) · Zbl 1477.68426
[41] Tombari, F., Di Stefano, L.: Object recognition in 3D scenes with occlusions and clutter by Hough voting. In: Proc. Pacific-Rim Symp. on Image and Video Technology, pp. 349-355 (2010)
[42] Tombari, F., Salti, S., Di Stefano, L.: Unique signatures of histograms for local surface description. In: Proc. European Conf. on Computer Vision (2010)
[43] Vogiatzis, G.; Hernández, C., Video-based, real-time multi view stereo, Image and Vision Computing, 29, 7, 434-441 (2011) · doi:10.1016/j.imavis.2011.01.006
[44] Woodford, O.J., Pham, M.-T., Maki, A., Perbet, F., Stenger, B.: Demisting the Hough transform for 3D shape recognition and registration. In: British Machine Vision Conference (2011)
[45] Woods, R. P.: Characterizing volume and surface deformations in an atlas framework: theory, applications, and implementation. NeuroImage 18(3), 769-788 (2003)
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.