# Helvipad: A Real-World Dataset for Omnidirectional Stereo Depth Estimation

[![arXiv](https://img.shields.io/badge/arXiv-2403.16999-b31b1b.svg)](https://arxiv.org/abs/2403.16999)
[![Dataset](https://img.shields.io/badge/Dataset-Download-blue.svg)](https://github.com/vita-epfl/helvipad/releases)
[![Project Page](https://img.shields.io/badge/Project-Page-brightgreen)](https://vita-epfl.github.io/Helvipad/)

![Front Page](static/images/front_page.png)

## Abstract

Despite considerable progress in stereo depth estimation, omnidirectional imaging remains underexplored,
mainly due to the lack of appropriate data.
We introduce <span style="font-variant: small-caps;">Helvipad</span>,
a real-world dataset for omnidirectional stereo depth estimation, consisting of 40K frames from video sequences
across diverse environments, including crowded indoor and outdoor scenes with various lighting conditions.
Collected using two 360° cameras in a top-bottom setup and a LiDAR sensor, the dataset includes accurate
depth and disparity labels obtained by projecting 3D point clouds onto equirectangular images. Additionally, we
provide an augmented training set with significantly increased label density obtained through depth completion.
We benchmark leading stereo depth estimation models for both standard and omnidirectional images.
The results show that while recent stereo methods perform decently, accurately estimating depth in
omnidirectional imaging remains a significant challenge. To address this, we introduce necessary adaptations
to stereo models, achieving improved performance.
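
As the abstract notes, the depth and disparity labels are obtained by projecting LiDAR point clouds onto equirectangular images. The sketch below is only a rough illustration of that projection step: it maps 3D points in a camera frame to equirectangular pixel coordinates, and its axis conventions, angle offsets, and image size are assumptions rather than the exact calibration used to build the dataset.

```python
import numpy as np

def project_to_equirectangular(points_xyz: np.ndarray, width: int, height: int):
    """Map 3D points (N, 3) in the camera frame to equirectangular pixel coords.

    Illustrative sketch only: the axis conventions and angle origins are
    assumptions and may differ from the calibration used for Helvipad.
    """
    x, y, z = points_xyz[:, 0], points_xyz[:, 1], points_xyz[:, 2]
    r = np.linalg.norm(points_xyz, axis=1)          # range to the point (candidate depth label)
    theta = np.arctan2(x, z)                        # azimuth in [-pi, pi]
    phi = np.arccos(np.clip(y / r, -1.0, 1.0))      # polar angle in [0, pi]
    u = (theta / (2 * np.pi) + 0.5) * width         # column: 360° horizontal field of view
    v = (phi / np.pi) * height                      # row: 180° vertical field of view
    return u, v, r
```
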
## Dataset Structure

The dataset is organized into training and testing subsets with the following structure:

```
helvipad/
├── train/
│   ├── camera_videos             # Raw video footage
│   ├── depth_maps                # Depth maps generated from LiDAR data
│   ├── depth_maps_augmented      # Augmented depth maps using depth completion
│   ├── disparity_maps            # Disparity maps computed from depth maps
│   ├── disparity_maps_augmented  # Augmented disparity maps using depth completion
│   ├── images_top                # Top-camera RGB images
│   ├── images_bottom             # Bottom-camera RGB images
│   ├── LiDAR_pcd                 # Original LiDAR point cloud data
├── test/
│   ├── camera_videos             # Raw video footage
│   ├── depth_maps                # Depth maps generated from LiDAR data
│   ├── disparity_maps            # Disparity maps computed from depth maps
│   ├── images_top                # Top-camera RGB images
│   ├── images_bottom             # Bottom-camera RGB images
│   ├── LiDAR_pcd                 # Original LiDAR point cloud data
```
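
A minimal loading sketch for a stereo pair and its depth map, assuming frames share file names across folders and depth maps are stored as 16-bit PNGs with a fixed scale factor; the actual naming scheme and encoding should be checked against the released files.

```python
from pathlib import Path

import numpy as np
from PIL import Image

# Assumptions to verify against the released data: matching file names across
# folders, 16-bit PNG depth maps, and a placeholder depth scale factor.
DATA_ROOT = Path("helvipad/train")
DEPTH_SCALE = 1000.0  # hypothetical: millimetres stored as uint16 -> metres

def load_sample(frame_name: str):
    top = np.asarray(Image.open(DATA_ROOT / "images_top" / frame_name))
    bottom = np.asarray(Image.open(DATA_ROOT / "images_bottom" / frame_name))
    depth_raw = np.asarray(Image.open(DATA_ROOT / "depth_maps" / frame_name))
    depth = depth_raw.astype(np.float32) / DEPTH_SCALE
    valid = depth_raw > 0  # pixels without a LiDAR return are assumed to be zero
    return top, bottom, depth, valid
```
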
## Benchmark

We evaluate the performance of multiple state-of-the-art and popular stereo matching methods, both for standard and 360° images. All models are trained on a single NVIDIA A100 GPU with the largest possible batch size to ensure comparable use of computational resources.

| Method | Type | Disp-MAE (°) | Disp-RMSE (°) | Disp-MARE | Depth-MAE (m) | Depth-RMSE (m) | Depth-MARE |
|--------------------|----------------|--------------|---------------|-----------|---------------|----------------|------------|
| [PSMNet](https://arxiv.org/abs/1803.08669) | Stereo | 0.33 | 0.54 | 0.20 | 2.79 | 6.17 | 0.29 |
| [360SD-Net](https://arxiv.org/abs/1911.04460) | 360° Stereo | 0.21 | 0.42 | 0.18 | 2.14 | 5.12 | 0.15 |
| [IGEV-Stereo](https://arxiv.org/abs/2303.06615) | Stereo | 0.22 | 0.41 | 0.17 | 1.85 | 4.44 | 0.15 |
| 360-IGEV-Stereo | 360° Stereo | **0.18** | **0.39** | **0.15** | **1.77** | **4.36** | **0.14** |
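
MAE, RMSE, and MARE denote the mean absolute error, root mean squared error, and mean absolute relative error, typically computed only over pixels with a valid ground-truth label. A minimal sketch of these metrics for depth maps is shown below; the exact evaluation protocol (per-frame averaging, clipping) may differ.

```python
import numpy as np

def depth_metrics(pred: np.ndarray, gt: np.ndarray, valid: np.ndarray):
    """MAE, RMSE and MARE over valid pixels. Sketch only: the official
    evaluation may average per frame or post-process predictions differently."""
    err = pred[valid] - gt[valid]
    mae = np.abs(err).mean()
    rmse = np.sqrt((err ** 2).mean())
    mare = (np.abs(err) / gt[valid]).mean()
    return mae, rmse, mare
```
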
## Download

The dataset will soon be available for download [here](https://github.com/vita-epfl/helvipad/releases).

## Project Page

For more information, visualizations, and updates, visit the **[project page](https://vita-epfl.github.io/Helvipad/)**.

## Citation

If you use the Helvipad dataset in your research, please cite our paper:

```bibtex
@misc{zayene2024helvipad,
  author        = {Zayene, Mehdi and Endres, Jannik and Havolli, Albias and Corbière, Charles and Cherkaoui, Salim and Ben Ahmed Kontouli, Alexandre and Alahi, Alexandre},
  title         = {Helvipad: A Real-World Dataset for Omnidirectional Stereo Depth Estimation},
  year          = {2024},
  eprint        = {2403.16999},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV}
}
```

## License

This dataset is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-sa/4.0/).

## Acknowledgments

This project was developed at the [Visual Intelligence for Transportation Laboratory (VITA)](https://www.epfl.ch/labs/vita/) at EPFL.
We thank all VITA lab members for their insightful feedback and help in improving the quality of this manuscript.
We also express our gratitude to Dr. Simone Schaub-Meyer and Oliver Hahn for their advice towards the end of the project.