We introduce a novel architecture for neural disparity refinement aimed at facilitating the deployment of 3D computer vision on cheap and widespread consumer devices, such as mobile phones. Our approach relies on a continuous formulation that enables the estimation of a refined disparity map at any arbitrary output resolution. Thereby, it can effectively handle the unbalanced camera setups typical of today's mobile phones, which feature both high- and low-resolution RGB sensors within the same device. Moreover, our neural network can seamlessly process the output of a variety of stereo methods and, by refining the disparity maps computed by a traditional matching algorithm such as SGM, it achieves unrivaled zero-shot generalization performance compared to state-of-the-art end-to-end stereo models.
@inproceedings{aleotti2021neural,
  title     = {Neural Disparity Refinement for Arbitrary Resolution Stereo},
  author    = {Aleotti, Filippo and Tosi, Fabio and Zama Ramirez, Pierluigi and Poggi, Matteo and Salti, Samuele and Di Stefano, Luigi and Mattoccia, Stefano},
  booktitle = {International Conference on 3D Vision (3DV)},
  year      = {2021}
}
Neural Disparity Refinement: architecture overview. Given a rectified stereo pair captured with either a balanced or an unbalanced (red dotted lines) stereo setup, our goal is to estimate a refined disparity map at any arbitrary spatial resolution, starting from noisy disparities pre-computed by any existing stereo black box. We first extract deep, high-dimensional features using two separate convolutional branches, which are combined by a decoder. Then, at each continuous 2D location in the image domain, we interpolate features across the levels of the decoder and feed them into a disparity estimation module realized through two MLPs, which predict an integer disparity value and a sub-pixel offset, respectively.
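To make the pipeline above concrete, below is a minimal PyTorch sketch of the idea, not the official implementation: two convolutional branches for the image and the noisy disparity, a small decoder, continuous feature sampling via grid_sample, and two MLP heads for the integer disparity and the sub-pixel offset. Layer widths, depths, and the number of disparity bins are illustrative assumptions.

```python
# Minimal sketch of the refinement architecture described above (not the
# authors' code). Widths, depths and max_disp are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(c_in, c_out, stride=1):
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, stride=stride, padding=1),
        nn.ReLU(inplace=True),
    )

class NeuralDisparityRefinement(nn.Module):
    def __init__(self, feat=32, max_disp=192):
        super().__init__()
        # Two separate convolutional branches: one for the RGB image,
        # one for the noisy input disparity map.
        self.rgb_branch = nn.Sequential(conv_block(3, feat), conv_block(feat, feat, 2))
        self.disp_branch = nn.Sequential(conv_block(1, feat), conv_block(feat, feat, 2))
        # Decoder combining the two branches; both levels are later
        # sampled at continuous query locations.
        self.dec1 = conv_block(2 * feat, feat)
        self.dec0 = conv_block(feat, feat)
        # Two MLP heads: logits over integer disparity bins + sub-pixel offset.
        self.mlp_int = nn.Sequential(nn.Linear(2 * feat, 128), nn.ReLU(), nn.Linear(128, max_disp))
        self.mlp_off = nn.Sequential(nn.Linear(2 * feat, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, rgb, noisy_disp, coords):
        # coords: (B, N, 2) continuous (x, y) query locations in [-1, 1],
        # decoupled from the input resolution -> arbitrary-resolution output.
        f = torch.cat([self.rgb_branch(rgb), self.disp_branch(noisy_disp)], dim=1)
        lvl1 = self.dec1(f)                                    # coarse level
        lvl0 = self.dec0(F.interpolate(lvl1, scale_factor=2))  # fine level
        grid = coords.unsqueeze(2)                             # (B, N, 1, 2)
        samples = [
            F.grid_sample(lvl, grid, align_corners=True).squeeze(-1).transpose(1, 2)
            for lvl in (lvl0, lvl1)
        ]                                                      # each (B, N, feat)
        q = torch.cat(samples, dim=-1)
        logits = self.mlp_int(q)                               # (B, N, max_disp)
        offset = torch.sigmoid(self.mlp_off(q)).squeeze(-1)    # sub-pixel in [0, 1)
        return logits.argmax(-1).float() + offset              # refined disparity
```

At inference, querying a dense grid of coordinates at any target size yields a refined map at that resolution, which is what decouples the output from the input resolution.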
Results on the SceneFlow testing set. From left to right: the RGB input image, the noisy input disparity map computed by SGM (rows 1-2), AD-Census (rows 3-4) and C-CNN (rows 5-6), and the corresponding refined disparity estimated by our network.
Generalization results on Middlebury 2014 of our network (pre-trained on SceneFlow). From left to right, the RGB input image, the noisy input disparity map computed by SGM and the refined disparity estimated by our network.
Generalization results on KITTI 2015 of our network (pre-trained on SceneFlow). From left to right, the RGB input image, the noisy input disparity map computed by SGM and the refined disparity estimated by our network.
Our network can also handle inputs at different resolutions. The top row depicts the input image at 3840 × 2160 and the disparity maps, D, computed by SGM when the right image is 480 × 270 or 320 × 180 (downsampling factors k = 8 and 12, respectively). The bottom row shows the ground truth and the disparity estimated by our network at 3840 × 2160.
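As a hedged sketch of this unbalanced setting (using OpenCV's SGBM as a stand-in for SGM; file names, sizes and matcher parameters are illustrative), one can run the matcher at the resolution of the low-resolution right view and rescale the resulting disparities into full-resolution pixel units before refinement:

```python
# Sketch: classical matching in an unbalanced stereo setup.
import cv2
import numpy as np

left = cv2.imread("left_4k.png", cv2.IMREAD_GRAYSCALE)     # 3840 x 2160
right = cv2.imread("right_low.png", cv2.IMREAD_GRAYSCALE)  # e.g. 480 x 270 (k = 8)

k = left.shape[1] // right.shape[1]  # downsampling factor between the two views
left_small = cv2.resize(left, (right.shape[1], right.shape[0]),
                        interpolation=cv2.INTER_AREA)

sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=64, blockSize=5)
disp_small = sgbm.compute(left_small, right).astype(np.float32) / 16.0  # fixed-point

# Disparities measured at low resolution must be multiplied by k to be
# expressed in full-resolution pixel units; this noisy, coarse map is the
# input that the refinement network then upsamples and cleans.
disp_full_units = disp_small * k
```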
Thanks to our continuous formulation, we can estimate a disparity map at any arbitrary resolution. The figure compares our output against standard nearest-neighbor interpolation.
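A short sketch of the two routes being compared, reusing the hypothetical NeuralDisparityRefinement module from the earlier snippet: querying the continuous model on a dense coordinate grid at the target resolution versus nearest-neighbor upsampling of the low-resolution map.

```python
# Continuous querying vs. nearest-neighbor upsampling (illustrative sizes).
import torch
import torch.nn.functional as F

def make_grid(h, w, device="cpu"):
    # Dense (x, y) query coordinates in [-1, 1] for an arbitrary target size.
    ys = torch.linspace(-1, 1, h, device=device)
    xs = torch.linspace(-1, 1, w, device=device)
    gy, gx = torch.meshgrid(ys, xs, indexing="ij")
    return torch.stack([gx, gy], dim=-1).view(1, -1, 2)  # (1, H*W, 2)

H, W = 540, 960  # any target resolution, decoupled from the input size
model = NeuralDisparityRefinement()
rgb = torch.rand(1, 3, 270, 480)    # low-resolution inputs for the sketch
noisy = torch.rand(1, 1, 270, 480)

# Continuous formulation: query the network directly at the target grid.
refined = model(rgb, noisy, make_grid(H, W)).view(1, 1, H, W)

# Baseline shown in the figure: nearest-neighbor interpolation of the map.
nn_up = F.interpolate(noisy, size=(H, W), mode="nearest")
```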