Estimation of 3D Point Cloud Coordinates Using Stereo Matching and Superpixel Segmentation
M. G. Mozerov (a, *), V. I. Kober (a, c, **), V. N. Karnaukhov (a), and L. V. Zimina (b)
(a) Institute for Information Transmission Problems, Russian Academy of Sciences, Moscow, 127051 Russia
(b) Moscow Polytechnic University, Moscow, 107023 Russia
(c) Center for Scientific Research and Higher Education, Ensenada, 22860 Mexico
*e-mail: mozer@iitp.ru
**e-mail: vitaly@iitp.ru
Received March 21, 2025
Abstract—An important task for mobile robotic systems is determining an object's position in scene-fixed coordinates, as well as the motion of the autonomous device itself in that coordinate system (a rotation matrix and a translation vector). All of these parameters can be obtained by estimating the 3D coordinates of a cloud of scene points in the coordinate system of the onboard video camera of an autonomous robotic device. The process of finding the rotation matrix and the translation vector is called registration of three-dimensional point clouds; it is of great importance in robotics and computer vision, and many registration methods exist. The open problem, however, is how to estimate the 3D coordinates of the scene's point cloud in the first place. In this paper, we propose to determine the desired point coordinates using stereo matching. The main assumption of classical stereo matching is that corresponding pixels in a stereo pair have the same brightness values. In general this assumption does not hold, owing to image noise and to differing illumination of the left and right images of the stereo pair. In addition, pixel-wise matching fails in regions without real texture. Methods that solve all the problems of dense stereo matching are rather complex and computationally expensive. The problem can, however, be solved in a simpler and cheaper way, because the clouds intended for registration consist of a small number of points, far fewer than the number of pixels in the image. Therefore, in our case stereo matching is performed over a certain neighborhood of the matched pixels, called a superpixel. The accuracy of such matching surpasses that of matching based on a feature similar to the robust census transform.
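As a brief illustration of the geometry involved (not code from the paper), the following Python sketch back-projects a single disparity estimate into the 3D coordinates of a scene point in the left-camera frame, assuming a rectified pinhole stereo model; the function name and all calibration values (focal length f, baseline b, principal point cx, cy) are hypothetical.

```python
# Minimal sketch: recover a 3D point from a disparity estimate under a
# rectified pinhole stereo model. All calibration values are hypothetical.

def disparity_to_point(u, v, d, f, b, cx, cy):
    """Triangulate the 3D point (X, Y, Z), in the left-camera frame,
    for pixel (u, v) with disparity d (all image quantities in pixels,
    baseline b in meters)."""
    if d <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    z = f * b / d            # depth from the standard stereo relation Z = f*B/d
    x = (u - cx) * z / f     # back-project through the pinhole model
    y = (v - cy) * z / f
    return x, y, z

# Example with hypothetical calibration: f = 700 px, baseline b = 0.12 m,
# principal point at (320, 240).
X, Y, Z = disparity_to_point(u=400, v=240, d=35.0,
                             f=700.0, b=0.12, cx=320.0, cy=240.0)
```

Applying this relation to every matched pixel retained for registration yields the point cloud in the camera coordinate system.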
In addition, we propose an original method for verifying the reliability of the disparity estimates based on local autocorrelation, which allows cloud points with unreliable disparities to be removed. As a result, our estimates of the 3D coordinates of "reliable" scene points are comparable in accuracy to those of the best state-of-the-art deep learning algorithms.
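The abstract does not spell out the autocorrelation test, so the sketch below only illustrates the general idea under stated assumptions: a disparity estimate at a pixel is flagged unreliable when a patch around it matches horizontally shifted copies of itself almost as well as itself (a flat local autocorrelation, i.e. little texture). All function names and the threshold value are hypothetical.

```python
# Hedged sketch of a local-autocorrelation reliability check; the paper's
# actual verification rule may differ. `image` is a 2D list of grayscale
# values, indexed as image[row][col].

def patch_sad(image, u0, v0, u1, v1, half):
    """Sum of absolute differences between two (2*half+1)^2 patches
    centered at (u0, v0) and (u1, v1)."""
    s = 0.0
    for dv in range(-half, half + 1):
        for du in range(-half, half + 1):
            s += abs(image[v0 + dv][u0 + du] - image[v1 + dv][u1 + du])
    return s

def is_reliable(image, u, v, half=2, max_shift=3, threshold=40.0):
    """Flag pixel (u, v) as unreliable when its patch matches shifted
    copies of itself nearly as well as itself (flat autocorrelation)."""
    best_off_center = min(
        patch_sad(image, u, v, u + s, v, half)
        for s in range(-max_shift, max_shift + 1) if s != 0
    )
    return best_off_center >= threshold  # hypothetical threshold
```

On a textureless patch the off-center self-matching cost stays near zero, so the disparity estimate there would be discarded from the cloud.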
Keywords: computer vision, point cloud, stereo matching, superpixel segmentation
DOI: 10.1134/S1064226925700251