03/27/2025
By Anugrah Vaishnav
Candidate Name: Anugrah Vaishnav
Defense Date: Thursday, April 10, 2025
Time: 9 - 10 a.m. EST
Location: Via Zoom
Thesis Title: Enhanced 3D Reconstruction in Colonoscopy Through Monocular Depth and Pose Estimation Using Transformer-based Neural Networks
Committee members:
Yu Cao (advisor), Miner School of Computer and Information Sciences, University of Massachusetts Lowell
Benyuan Liu (member), Miner School of Computer and Information Sciences, University of Massachusetts Lowell
Ming Shao (member), Miner School of Computer and Information Sciences, University of Massachusetts Lowell
Abstract:
This work adapts the DepthAnythingV2 model for monocular depth estimation in colonoscopy, addressing the challenges posed by the complex and dynamic visual characteristics of the colon. Because depth estimation methods designed for natural scenes do not transfer directly to medical imaging, we fine-tune the transformer-based DepthAnythingV2 model on the synthetic SimCol3D dataset, which provides ground truth depth maps. Our approach demonstrates effective knowledge transfer to the endoscopic domain, achieving a mean absolute error as low as 0.002 on synthetic data. We also generate detailed 3D meshes of the colon through offline processing. In addition, we develop a pose estimation framework that captures the relative camera motion between consecutive frames, complementing the depth estimation module in the overall 3D reconstruction process. Although the current system does not support real-time inference, its performance and offline reconstruction capabilities lay a foundation for future work on integrating automated 3D reconstruction into clinical workflows, with the potential to improve diagnostic accuracy and patient outcomes in minimally invasive procedures.
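For readers unfamiliar with the two quantities the abstract relies on, the following is a minimal illustrative sketch, not the candidate's implementation. It shows how a mean absolute error between a predicted and a ground truth depth map can be computed, and how a depth map can be back-projected into 3D points under a standard pinhole camera model, which is a common first step before meshing. The function names and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions introduced here for illustration, and the depth units are whatever the dataset uses.

```python
from typing import List, Sequence, Tuple

# A depth map as rows of per-pixel depth values (hypothetical representation).
Depth = Sequence[Sequence[float]]


def mean_absolute_error(pred: Depth, target: Depth) -> float:
    """Average of |pred - target| over all pixels of two same-sized maps."""
    total = 0.0
    count = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            total += abs(p - t)
            count += 1
    if count == 0:
        raise ValueError("depth maps are empty")
    return total / count


def backproject(
    depth: Depth, fx: float, fy: float, cx: float, cy: float
) -> List[Tuple[float, float, float]]:
    """Lift each pixel (u, v) with depth z to a 3D point in the camera frame.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy), all in pixels; these values are illustrative inputs.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    pred = [[1.0, 2.0], [3.0, 4.0]]
    target = [[1.0, 2.5], [2.5, 4.0]]
    print(mean_absolute_error(pred, target))
    print(backproject(pred, fx=2.0, fy=2.0, cx=0.0, cy=0.0))
```

In a full reconstruction pipeline, points from consecutive frames would additionally be transformed into a common coordinate frame using the estimated relative camera poses before being fused into a mesh; that step is omitted here.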