16:00 - 17:30 - May 18 (Tuesday)
Session chair: Andrew Willis, University of North Carolina, Charlotte
In this paper, we consider the multi-sensor data fusion problem in a Simultaneous Localization And Mapping (SLAM) application. More precisely, we tackle the integration of inertial information into Bundle Adjustment (BA). The BA cost function is composed of two weighted objectives, one from each sensor, and we address the problem of finding an efficient weight that relates the errors of the two sensors. We present an investigation of three fully automatic, real-time methods for determining the weight in an extended Bundle Adjustment.
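As a rough illustration of the two-objective cost described in this abstract, the sketch below combines a visual and an inertial residual vector under a single scalar weight. The quadratic form, the squared weight, and the names `weighted_ba_cost`, `visual_residuals`, `inertial_residuals`, and `weight` are assumptions made for illustration; the abstract does not give the exact cost, nor does it name the three weight-selection methods, so none of them is shown here.

```python
def weighted_ba_cost(visual_residuals, inertial_residuals, weight):
    """Toy two-term cost: squared visual error plus weighted squared inertial error.

    Assumed form (not taken from the paper):
        cost = sum(r_vis^2) + weight^2 * sum(r_imu^2)
    """
    visual_term = sum(r * r for r in visual_residuals)
    inertial_term = sum(r * r for r in inertial_residuals)
    return visual_term + weight ** 2 * inertial_term


# Hypothetical residuals: two reprojection errors and one inertial error.
cost = weighted_ba_cost([1.0, 2.0], [3.0], weight=0.5)
print(cost)
```

A larger weight makes the inertial term dominate the optimization, while a weight of zero reduces the cost to the purely visual one, which is why the choice of weight matters.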
This paper presents a method for detecting, describing, and matching keypoints in combined range-intensity data. We extend a 2D image-based detection and description framework to 3D using an image back-projected onto a range scan. A key feature of the framework is a physical scale space for detecting keypoints, which eliminates errors in scale during both detection and matching. We develop smoothing, differentiation, and description techniques that are focused on making the keypoint invariant to viewpoint, sampling, and intensity changes. We demonstrate the power of our algorithm with comparisons to a back-projected SIFT algorithm, showing that it is able to find and match keypoints in a variety of challenging scan pairs.
We present a method for detecting correspondences between non-rigid shapes that utilizes surface descriptors based on the eigenfunctions of the Laplace-Beltrami operator. We use clusters of probable matched descriptors to resolve the sign ambiguity in matching the eigenfunctions. We then define a matching cost that measures both the descriptor similarity and the similarity between corresponding geodesic distances measured on the two shapes. We seek a correspondence by minimizing this cost. The resulting combinatorial problem is then reduced to the problem of matching a small number of feature points using quadratic integer programming.
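The matching cost in this abstract can be sketched for a handful of feature points as a linear descriptor term plus a quadratic geodesic-distortion term. The exact cost, the trade-off parameter `mu`, and the function names below are assumptions; the exhaustive search over permutations is only a stand-in for the quadratic integer programming solver mentioned in the abstract and is feasible only for very small point sets.

```python
from itertools import permutations


def matching_cost(assignment, desc_x, desc_y, geo_x, geo_y, mu=1.0):
    """Cost of mapping point i on shape X to point assignment[i] on shape Y."""
    n = len(assignment)
    # Linear term: squared distance between descriptors of matched points.
    linear = sum(
        sum((a - b) ** 2 for a, b in zip(desc_x[i], desc_y[assignment[i]]))
        for i in range(n)
    )
    # Quadratic term: mismatch between corresponding geodesic distances.
    quadratic = sum(
        abs(geo_x[i][k] - geo_y[assignment[i]][assignment[k]])
        for i in range(n)
        for k in range(n)
    )
    return linear + mu * quadratic


def best_correspondence(desc_x, desc_y, geo_x, geo_y, mu=1.0):
    """Brute-force minimizer (illustrative stand-in for an integer program)."""
    n = len(desc_x)
    return min(
        permutations(range(n)),
        key=lambda p: matching_cost(p, desc_x, desc_y, geo_x, geo_y, mu),
    )


# Hypothetical data: shape Y is shape X with its three points relabeled.
desc_x = [[0.0], [1.0], [2.0]]
geo_x = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
desc_y = [[1.0], [2.0], [0.0]]
geo_y = [[0, 1, 1], [1, 0, 2], [1, 2, 0]]

print(best_correspondence(desc_x, desc_y, geo_x, geo_y))
```

Because the quadratic term couples pairs of matches, the problem is a quadratic assignment problem, which is why restricting it to a small number of feature points keeps it tractable.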
We describe a method for obtaining a set of temporally coherent segments from an evolving 3D scene. The scene is reconstructed independently at various frames in a sequence, using multi-view silhouette and stereo information. We consider generic scenes composed of multiple human or non-human actors interacting with each other. Due to the generality of this problem, no model can be imposed on the reconstruction process, and the resulting meshes suffer from various geometric and topological inconsistencies. To analyze such data, we exploit a crucial insight: convex segments detected in such a scene usually represent the limbs of an articulated body and therefore exhibit rigid motions over time. Our method describes a given dynamic 3D scene through a set of approximately convex segments and their rigid motions over time, computing both automatically.