Multi-View Stereo Network With Gaussian Distribution Iteration

by: Xiaohan Zhang, Shikun Li

Format: Article
Published: IEEE 2023-01-01

Description

Multi-view stereo estimates depth maps for multiple perspective images of a scene and then fuses them into a 3D point cloud of that scene, making it an essential technology for 3D reconstruction. In this paper, we propose GDINet, a deep learning method that applies probabilistic methods to the pyramid framework and can significantly improve reconstruction quality. In detail, we first establish a Gaussian distribution for each pixel of each image and iterate it through the pyramid framework. The mean is the estimated depth, and the variance represents the depth estimation error. In addition, we design a novel loss function with excellent convergence to train our network. Finally, we present an initialization module that generates the coarse Gaussian distribution and keeps its parameters within a reasonable range. Our results rank 2nd on both the DTU and Tanks & Temples datasets, showing that our network has high accuracy, completeness, and robustness. We also present a visual comparison on the BlendedMVS dataset (which contains many aerial scene images) to demonstrate the generalization ability of our model.
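
As a rough illustration of how a per-pixel Gaussian could be carried from one pyramid level to the next, the Python sketch below upsamples a coarse mean and standard deviation map and spreads depth hypotheses around the mean. It is not taken from the paper: the function names, the nearest-neighbor upsampling, the 3-sigma search interval, and the number of samples are all assumptions chosen for illustration, and GDINet's actual update rule may differ.

```python
import numpy as np

def upsample_nearest(x, factor=2):
    """Nearest-neighbor upsampling of a 2D map (an assumed, simple choice)."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def depth_hypotheses(mu, sigma, num_samples=8, k=3.0, factor=2):
    """Hypothetical helper: build per-pixel depth hypotheses for the next,
    finer pyramid level from a coarse Gaussian (mean mu, std sigma).

    Returns an array of shape (num_samples, H * factor, W * factor) whose
    values span [mu - k * sigma, mu + k * sigma] at every pixel.
    """
    mu_up = upsample_nearest(mu, factor)
    sigma_up = upsample_nearest(sigma, factor)
    # Offsets expressed in units of the per-pixel standard deviation.
    offsets = np.linspace(-k, k, num_samples)
    return mu_up[None] + offsets[:, None, None] * sigma_up[None]

# Toy coarse level: every pixel has depth 500 with standard deviation 10.
mu = np.full((2, 3), 500.0)
sigma = np.full((2, 3), 10.0)
hyps = depth_hypotheses(mu, sigma)
print(hyps.shape)
```

In this sketch, a pixel with a small variance gets a narrow search interval at the finer level, while an uncertain pixel keeps a wide one, which is one plausible reading of the variance as a depth estimation error.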