The full text of this item is not available at this time because the student has placed this item under an embargo for a period of time. The Libraries are not authorized to provide a copy of this work during the embargo period, even for Texas A&M users with NetID.
Algorithms for Robust Geometry Estimation in AR Applications
Abstract
Augmented Reality (AR) integrates virtual content into the real world. When an AR application runs in an unknown environment, it must construct a map of that environment and localize itself within the map, which is the classic Simultaneous Localization and Mapping (SLAM) problem. For AR applications, visual SLAM is often the primary choice because cameras not only help capture reality but are also standard hardware on AR devices. Existing approaches have demonstrated visual SLAM algorithms that are accurate and run in real time, yet visual SLAM is still not robust and inevitably fails from time to time. One major challenge to robustness comes from scene limitations, which often lead to uneven feature distribution. Another is the weak epipolar constraint, the geometric constraint most commonly used to recover camera pose. To address these challenges, we investigate how to improve each step of visual SLAM. The basic rationale is that the result of visual SLAM is built on the sequential results returned by every step.
Despite existing efforts to improve camera pose estimation, the best existing approach, which builds on Random Sample Consensus (RANSAC), still has a non-negligible failure rate. To enhance the robustness of the estimation algorithm, we propose a new Robust Camera Motion Estimator (RCME) that incorporates model uncertainty into the RANSAC framework. There are two main changes: a model-sample consistency test at the model instantiation step, and an inlier set quality test that verifies model-inlier consistency using differential entropy. Our algorithm shows a consistent reduction in failure rate compared to the RANSAC-based Gold Standard approach and two recent RANSAC variants.
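The RANSAC framework that RCME builds on can be illustrated with a minimal toy example. The sketch below fits a 2D line instead of a camera motion model, and the extra check on the minimal sample is only a simplified stand-in for the dissertation's model-sample consistency test; function names and thresholds are illustrative, not from the dissertation.

```python
import random
import math

def fit_line(p, q):
    """Fit a normalized line ax + by + c = 0 through two sample points."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    n = math.hypot(a, b)
    if n == 0:
        return None  # degenerate sample: both points coincide
    return (a / n, b / n, (x2 * y1 - x1 * y2) / n)

def point_line_dist(line, pt):
    a, b, c = line
    x, y = pt
    return abs(a * x + b * y + c)

def ransac_line(points, n_iters=200, tol=0.05, seed=0):
    """Basic RANSAC hypothesize-and-verify loop with a toy model-sample
    consistency check: a hypothesis is discarded unless its own minimal
    sample fits the instantiated model."""
    rng = random.Random(seed)
    best_line, best_inliers = None, []
    for _ in range(n_iters):
        p, q = rng.sample(points, 2)
        line = fit_line(p, q)
        if line is None:
            continue
        # model-sample consistency: the minimal sample must fit its own model
        if max(point_line_dist(line, p), point_line_dist(line, q)) > 1e-9:
            continue
        inliers = [pt for pt in points if point_line_dist(line, pt) < tol]
        if len(inliers) > len(best_inliers):
            best_line, best_inliers = line, inliers
    return best_line, best_inliers
```

In the dissertation's setting the model is a camera motion hypothesis and the verification additionally scores the inlier set with differential entropy; the toy example keeps only the hypothesize-and-verify skeleton.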
AR applications often rely on markers to define the world coordinate system for calibration or object registration. However, printing and deploying Artificial Landmarks (ALs) requires strong computer vision expertise. Inspired by the rigidity, low cost, manufacturing precision, and wide availability of LEGO baseplates, we propose using LEGO baseplates as new ALs for AR applications. LEGO baseplates are monochromatic, low in contrast, and easily affected by lighting; to overcome these issues, our algorithm design exploits geometric and semantic information by leveraging the grid pattern, circular stud shapes, and text patterns. The algorithm extensively cross-validates this information during noise filtering and position refinement using robust estimation methods, and recovers more than 95% of stud centers as feature points to ensure pose estimation accuracy. Our results also show that the LEGO baseplate produces significantly more accurate camera pose estimation than its existing state-of-the-art counterpart when both methods are deployed by users with no computer vision background.
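One way the known grid pattern can be used for cross-validation is to snap each detected stud center to the nearest node of an ideal grid and reject detections that deviate too far. The sketch below is a simplified illustration of that idea, not the dissertation's actual refinement pipeline; the function name, the pitch-relative tolerance, and the axis-aligned grid assumption are all hypothetical.

```python
import numpy as np

def snap_to_grid(centers, spacing, max_dev=0.2):
    """Snap detected stud centers to the nearest grid node and reject
    detections that deviate too far from the ideal LEGO grid.

    centers : (N, 2) detected centers, assumed already in a frame aligned
              with the baseplate grid (a simplifying assumption here)
    spacing : stud pitch in the same units as `centers`
    max_dev : allowed deviation as a fraction of the pitch
    Returns the snapped inlier positions and a boolean keep-mask.
    """
    centers = np.asarray(centers, dtype=float)
    grid_idx = np.round(centers / spacing)   # nearest grid node index (i, j)
    snapped = grid_idx * spacing             # ideal node position
    dev = np.linalg.norm(centers - snapped, axis=1)
    keep = dev <= max_dev * spacing
    return snapped[keep], keep
```

A real detector must first estimate the grid's orientation, origin, and pitch robustly (e.g., with RANSAC over candidate stud centers) before a snap-and-reject step like this becomes meaningful.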
However, the first two works are challenged by poor image quality. Optical Image Stabilization (OIS), a ubiquitous mechanism in mobile devices for maintaining image quality under hand jitter, changes the camera's intrinsic parameters as a side effect, which violates the precise camera model required for mapping. Being aware of its existence is important before a higher-order SLAM model is applied. We present a two-step approach, built on two statistical hypothesis tests, that detects whether an image conforms to a given camera model, including the distortion coefficients and the intrinsic matrix. Our algorithm achieves 85.4% recall and 100% precision in detecting model inconsistency, and thus the presence of an OIS system.
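The flavor of such a hypothesis test can be sketched as follows: under the assumed camera model, reprojection residuals normalized by their expected noise level should behave like unit-variance noise, so an inflated chi-square statistic signals model inconsistency. This is only an illustrative one-sided test with a hypothetical threshold, not either of the dissertation's two actual tests.

```python
import math

def model_consistency_test(residuals, sigma, z_thresh=3.0):
    """One-sided test on reprojection residuals.

    Under the null hypothesis (image conforms to the given camera model),
    sum((r/sigma)^2) follows a chi-square distribution with n degrees of
    freedom, which has mean n and variance 2n.  We use the normal
    approximation and reject when the z-score exceeds z_thresh
    (a hypothetical default, not a value from the dissertation).
    Returns True when the given camera model is rejected.
    """
    n = len(residuals)
    s = sum((r / sigma) ** 2 for r in residuals)  # ~ chi-square, n dof
    z = (s - n) / math.sqrt(2 * n)
    return z > z_thresh
```

In practice the noise level sigma must itself be estimated, and residuals come from reprojecting known calibration geometry through the candidate model.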
With an understanding of the camera model inconsistency introduced by the OIS system, we focus on solving for the dynamic camera model under the OIS effect, which hinders accurate camera pose estimation and 3D reconstruction. We propose a novel neural network-based approach that predicts the dynamic camera matrix in real time so that pose estimation or scene reconstruction can run at the camera's native resolution for the highest accuracy on mobile devices. Our network takes the gridified projection model discrepancy feature and 3D point positions as inputs and employs a Multi-Layer Perceptron (MLP) to approximate the dynamic intrinsic manifold. We design a unique training scheme that introduces a Back-propagated PnP (BPnP) layer [1] so that the reprojection error can be used as the loss function. Training uses precise calibration patterns to capture the accurate manifold, but the trained network can then be used anywhere. We name the Dynamic Intrinsic Manifold Estimation network DIME-Net. Our algorithm achieves at least a 64% reduction in reprojection error.
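The reprojection-error loss at the heart of that training scheme is straightforward to state: project the 3D points through the predicted intrinsic matrix and measure the mean pixel distance to the observations. The sketch below shows only this forward computation under a pinhole model with no distortion and points already in the camera frame; it is not the DIME-Net or BPnP pipeline itself.

```python
import numpy as np

def project(K, pts_cam):
    """Project 3D points (N, 3), given in the camera frame, through the
    intrinsic matrix K (3x3); returns pixel coordinates (N, 2)."""
    uvw = (K @ pts_cam.T).T
    return uvw[:, :2] / uvw[:, 2:3]

def reprojection_error(K, pts_cam, observed_px):
    """Mean reprojection error in pixels: the kind of quantity the
    BPnP-based training scheme back-propagates through the predicted
    dynamic intrinsics."""
    err = project(K, pts_cam) - observed_px
    return float(np.mean(np.linalg.norm(err, axis=1)))
```

In the full pipeline the pose that maps world points into the camera frame is itself recovered by a PnP solver, and the BPnP layer makes that solver differentiable so the loss can reach the intrinsic-predicting MLP.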
Subject
Robotics
Augmented Reality
Computer Vision
Localization and Mapping
Geometry Estimation
Robust Estimation
Citation
Yeh, Shu-Hao (2023). Algorithms for Robust Geometry Estimation in AR Applications. Doctoral dissertation, Texas A&M University. Available electronically from https://hdl.handle.net/1969.1/198921.