Learning Gaussian Noise Models for State Estimation
The uncertainty of measurements can vary with the quality of the input, but predicting that uncertainty is challenging
Autonomous vehicles often carry highly complex sensors that include an internal processing stage to turn high-dimensional sensor data into a direct observation of the state. For example, odometry sensors based on RGB-D cameras (a 480 by 640 image with three color channels and one depth channel) may process this high-dimensional raw data by minimizing photometric error. Because of the complexity of the measurement processing stage, the accuracy of the measurements is affected by the state of the environment. For example, the incremental translational error in estimated visual odometry can be larger in scenes with a walking person if features on the person are assumed to be static, as visualized below. However, hand-coding rules and manually specifying features to capture these effects is difficult, especially for non-domain experts.
Figure: A person walking in front of the camera can add error to an odometry system that is not designed to reject dynamic objects.
Figure: Lack of texture or depth features can also add error to visual odometry. Furthermore, different algorithms may output different types of error when given no visual information.
Deep Inference for Covariance Estimation (DICE) uses a convolutional neural network to predict noise models for a given sensor input
We assume measurement noise is normally distributed, and approximate the complex mapping between raw sensor measurements and the noise model using a feed-forward convolutional neural network. If ground-truth covariances were available, we could train the network directly using them as labels. However, true measurement covariances are often difficult to obtain; it is much easier to collect measurement errors drawn from the underlying distribution. We therefore use measurement errors to weakly supervise covariance prediction.
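To make the weak supervision concrete, here is a minimal PyTorch sketch of the idea: a small head predicts per-dimension log-variances (i.e., a diagonal Gaussian noise model) from image features, and training minimizes the negative log-likelihood of observed measurement errors under that predicted Gaussian. The names, the diagonal parameterization, and the feature dimensions are illustrative assumptions, not the exact DICE architecture or covariance parameterization.

```python
import torch
import torch.nn as nn

class CovarianceHead(nn.Module):
    """Predicts per-dimension log-variances (a diagonal covariance) from CNN features."""
    def __init__(self, feature_dim: int, meas_dim: int):
        super().__init__()
        self.fc = nn.Linear(feature_dim, meas_dim)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # Predicting log-variances keeps the covariance positive by construction.
        return self.fc(features)

def gaussian_nll_loss(errors: torch.Tensor, log_var: torch.Tensor) -> torch.Tensor:
    """Negative log-likelihood of zero-mean measurement errors under the predicted
    diagonal Gaussian noise model. errors, log_var: (batch, meas_dim)."""
    return 0.5 * (log_var + errors.pow(2) * torch.exp(-log_var)).sum(dim=-1).mean()

# Toy usage: 6-DOF odometry errors (weak labels) supervise a covariance predicted
# from image features produced by some upstream CNN (placeholder random tensors here).
features = torch.randn(8, 128)
errors = torch.randn(8, 6)
head = CovarianceHead(feature_dim=128, meas_dim=6)
loss = gaussian_nll_loss(errors, head(features))
loss.backward()
```

The key property is that no covariance labels appear anywhere: the loss trades off inflating the predicted variance (penalized by the log-determinant term) against explaining the observed errors (the Mahalanobis-style term), so the network learns to output larger covariances for inputs that tend to produce larger errors.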
DICE predicts qualitatively reasonable noise estimates and improves state estimation
Figure: DICE covariance estimates are shown in red, and an empirical covariance computed over the dataset in blue. The covariances are visualized along the camera trajectory (shown in black). Unlike the constant empirical covariance, the covariances estimated by DICE increase when the scene is dynamic and decrease in feature-rich static environments.
To learn more about this project, check out our paper or ICRA spotlight video.
Related Threads:
Here are a few related projects that I led or collaborated on.
Weakly Supervised Learning. One of the most straightforward ways to train deep neural networks is to formulate a fully supervised learning problem, where labels are available for the exact quantity of interest (for example, a covariance). However, for many robotics problems, such labels are extremely difficult to obtain due to the resources and overhead required. I consider weakly supervised learning to encompass methods that use indirect labels of the quantity of interest to optimize network parameters (for example, observed measurement errors). Elsewhere, I've explored using some flavor of weak supervision for navigation (see here) and to predict 3D object parameters (see here).
Visual Estimation. For more computer vision projects, check out VoluMon (see here) and ROSHAN (see here).