Self-supervised monocular depth estimation based on pseudo-pose guidance and grid regularization

Abstract

Self-supervised monocular depth estimation (SMDE) has emerged as a promising alternative for generating dense depth maps in outdoor scenarios because of its low requirements for training data and sensors. However, training on consecutive temporal frames alone, without depth ground truth, leads to problems such as a lack of global supervision and inaccurate depth estimation in low-texture areas. In this study, we propose a pseudo-label generation (PLG) module, a two-stream pose estimation (TSPE) structure, and a grid regularization loss function to address these issues. The PLG module automatically generates a pseudo-grid and a pseudo-pose during data preprocessing. The pseudo-grid provides reliable global position and direction information for supervision, while the pseudo-pose supplies additional geometric information to the TSPE. The TSPE fuses the geometry-based pseudo-pose with the network-based pose using a simple structure, and this additional geometric information improves the generalization of pose estimation. The proposed grid regularization loss function improves depth estimation by handling depth errors in low-texture areas that would otherwise be neglected. Experiments show that our method improves depth estimation performance, especially at object boundaries and in low-texture areas, with no additional training data or model parameters.

Keywords

Monocular depth estimation; Self-supervised learning; Pseudo-label generation; Grid regularization