Building project page and Github readme
chcorbi committed Nov 25, 2024
1 parent f86a6d4 commit d8b0ea1
Showing 33 changed files with 318 additions and 390 deletions.
97 changes: 87 additions & 10 deletions README.md
@@ -1,16 +1,93 @@
# Helvipad: A Real-World Dataset for Omnidirectional Stereo Depth Estimation

[![arXiv](https://img.shields.io/badge/arXiv-2403.16999-b31b1b.svg)](https://arxiv.org/abs/2403.16999)
[![Dataset](https://img.shields.io/badge/Dataset-Download-blue.svg)](https://github.com/vita-epfl/helvipad/releases)
[![Project Page](https://img.shields.io/badge/Project-Page-brightgreen)](https://vita-epfl.github.io/Helvipad/)

![Front Page](static/images/front_page.png)
## Abstract

Despite considerable progress in stereo depth estimation, omnidirectional imaging remains underexplored,
mainly due to the lack of appropriate data.
We introduce <span style="font-variant: small-caps;">Helvipad</span>,
a real-world dataset for omnidirectional stereo depth estimation, consisting of 40K frames from video sequences
across diverse environments, including crowded indoor and outdoor scenes with diverse lighting conditions.
Collected using two 360° cameras in a top-bottom setup and a LiDAR sensor, the dataset includes accurate
depth and disparity labels by projecting 3D point clouds onto equirectangular images. Additionally, we
provide an augmented training set with a significantly increased label density by using depth completion.
We benchmark leading stereo depth estimation models for both standard and omnidirectional images.
Results show that while recent stereo methods perform decently, a significant challenge persists in accurately
estimating depth in omnidirectional imaging. To address this, we introduce necessary adaptations to stereo models,
achieving improved performance.
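
The label-generation idea described above (projecting 3D points onto equirectangular images and relating depth to angular disparity for a top-bottom camera pair) can be illustrated with a minimal geometric sketch. Everything about the convention here is an assumption made for illustration, not the dataset's documented pipeline: the camera frame has `z` pointing up, the polar angle is measured from `+z`, the top camera sits `baseline` metres directly above the bottom camera, depth means Euclidean distance to the bottom camera, and all function names are hypothetical.

```python
import math


def point_to_equirect_pixel(x, y, z, width, height):
    """Map a 3D point in the camera frame to equirectangular pixel coordinates.

    Assumed convention (illustrative only): z is up, azimuth phi = atan2(y, x),
    polar angle theta is measured from +z and spans [0, pi] over the image height.
    Returns (u, v, r), where r is the Euclidean distance to the camera.
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("point coincides with the camera centre")
    theta = math.acos(max(-1.0, min(1.0, z / r)))
    phi = math.atan2(y, x)
    # Wrap the column index so that phi = pi maps to column 0 instead of `width`.
    u = ((phi + math.pi) / (2.0 * math.pi) * width) % width
    v = theta / math.pi * height
    return u, v, r


def depth_to_disparity_deg(r_bottom, theta_bottom, baseline):
    """Angular disparity (degrees) between the top and bottom views of one point.

    Follows from the triangle (bottom camera, top camera, point):
    tan(d) = B * sin(theta) / (r - B * cos(theta)).
    """
    d = math.atan2(
        baseline * math.sin(theta_bottom),
        r_bottom - baseline * math.cos(theta_bottom),
    )
    return math.degrees(d)


def disparity_to_depth(disp_deg, theta_bottom, baseline):
    """Inverse mapping via the sine rule: r = B * sin(theta + d) / sin(d)."""
    d = math.radians(disp_deg)
    if d == 0.0:
        return math.inf  # zero disparity corresponds to a point at infinity
    return baseline * math.sin(theta_bottom + d) / math.sin(d)
```

Under these assumptions, a point one baseline away on the bottom camera's horizon has a disparity of 45°, and converting a depth to disparity and back recovers the original depth.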

## Dataset Structure

The dataset is organized into training and testing subsets with the following structure:

```
helvipad/
├── train/
│   ├── camera_videos              # Raw video footage
│   ├── depth_maps                 # Depth maps generated from LiDAR data
│   ├── depth_maps_augmented       # Augmented depth maps using depth completion
│   ├── disparity_maps             # Disparity maps computed from depth maps
│   ├── disparity_maps_augmented   # Augmented disparity maps using depth completion
│   ├── images_top                 # Top-camera RGB images
│   ├── images_bottom              # Bottom-camera RGB images
│   └── LiDAR_pcd                  # Original LiDAR point cloud data
└── test/
    ├── camera_videos              # Raw video footage
    ├── depth_maps                 # Depth maps generated from LiDAR data
    ├── disparity_maps             # Disparity maps computed from depth maps
    ├── images_top                 # Top-camera RGB images
    ├── images_bottom              # Bottom-camera RGB images
    └── LiDAR_pcd                  # Original LiDAR point cloud data
```
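
After downloading, a quick sanity check of the folder layout can catch an incomplete extraction. The sketch below only relies on the directory names listed above; the helper name `missing_subdirs` and the local root path are hypothetical, and the file naming inside each folder is deliberately not assumed.

```python
from pathlib import Path

# Sub-folders shared by both splits, as listed in the structure above.
COMMON_SUBDIRS = [
    "camera_videos",
    "depth_maps",
    "disparity_maps",
    "images_top",
    "images_bottom",
    "LiDAR_pcd",
]

# The augmented label folders are listed for the training split only.
EXPECTED_SUBDIRS = {
    "train": COMMON_SUBDIRS + ["depth_maps_augmented", "disparity_maps_augmented"],
    "test": COMMON_SUBDIRS,
}


def missing_subdirs(root):
    """Return the expected 'split/subdir' entries that are absent under root."""
    root = Path(root)
    return [
        f"{split}/{name}"
        for split, names in EXPECTED_SUBDIRS.items()
        for name in names
        if not (root / split / name).is_dir()
    ]
```

Calling `missing_subdirs("helvipad")` on a complete copy should return an empty list; any returned entry names a folder that still needs to be downloaded or extracted.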

## Benchmark

We evaluate multiple state-of-the-art and popular stereo matching methods for both standard and 360° images. All models are trained on a single NVIDIA A100 GPU with the largest possible batch size to ensure comparable use of computational resources.

| Method | Type | Disp-MAE (°) | Disp-RMSE (°) | Disp-MARE | Depth-MAE (m) | Depth-RMSE (m) | Depth-MARE |
|--------------------|----------------|--------------|---------------|-----------|---------------|----------------|----------------|
| [PSMNet](https://arxiv.org/abs/1803.08669) | Stereo | 0.33 | 0.54 | 0.20 | 2.79 | 6.17 | 0.29 |
| [360SD-Net](https://arxiv.org/abs/1911.04460) | 360° Stereo | 0.21 | 0.42 | 0.18 | 2.14 | 5.12 | 0.15 |
| [IGEV-Stereo](https://arxiv.org/abs/2303.06615) | Stereo | 0.22 | 0.41 | 0.17 | 1.85 | 4.44 | 0.15 |
| 360-IGEV-Stereo | 360° Stereo | **0.18** | **0.39** | **0.15** | **1.77** | **4.36** | **0.14** |
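
For readers who want to reproduce numbers of this kind, the sketch below uses the standard definitions of MAE, RMSE, and MARE, restricted to pixels that carry a ground-truth label. It is an illustration under assumptions rather than the benchmark's evaluation code: the function name, the flat-list inputs, and the explicit validity mask are hypothetical, and the exact protocol (for example, any depth range clipping) may differ. MARE is a ratio of errors to ground-truth values, so it carries no unit.

```python
import math


def error_metrics(pred, gt, valid):
    """Compute (MAE, RMSE, MARE) over the entries where `valid` is true.

    pred, gt: flat sequences of predicted and ground-truth values
              (depth in metres or disparity in degrees).
    valid:    flat sequence of booleans marking pixels that have a label;
              labelled ground-truth values are assumed to be non-zero.
    """
    pairs = [(p, g) for p, g, m in zip(pred, gt, valid) if m]
    if not pairs:
        raise ValueError("no valid pixels to evaluate")
    n = len(pairs)
    abs_errors = [abs(p - g) for p, g in pairs]
    mae = sum(abs_errors) / n
    rmse = math.sqrt(sum(e * e for e in abs_errors) / n)
    mare = sum(abs(p - g) / abs(g) for p, g in pairs) / n
    return mae, rmse, mare
```

Because LiDAR-derived labels are sparse, masking out unlabelled pixels before averaging matters: including them would bias all three metrics.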

## Download

The dataset will soon be available for download [here](https://github.com/vita-epfl/helvipad/releases).


## Project Page

For more information, visualizations, and updates, visit the **[project page](https://vita-epfl.github.io/Helvipad/)**.

## Citation

If you use the Helvipad dataset in your research, please cite our paper:

```bibtex
@misc{zayene2024helvipad,
author = {Zayene, Mehdi and Endres, Jannik and Havolli, Albias and Corbière, Charles and Cherkaoui, Salim and Ben Ahmed Kontouli, Alexandre and Alahi, Alexandre},
title = {Helvipad: A Real-World Dataset for Omnidirectional Stereo Depth Estimation},
year = {2024},
eprint = {2403.16999},
archivePrefix = {arXiv},
primaryClass = {cs.CV}
}
```

## License

This dataset is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/).

## Acknowledgments

This project was developed at the [Visual Intelligence for Transportation Laboratory (VITA)](https://www.epfl.ch/labs/vita/) at EPFL.
We thank all VITA lab members for their insightful feedback and help in improving the quality of this manuscript.
We also express our gratitude to Dr. Simone Schaub-Meyer and Oliver Hahn for their advice towards the end of the project.