
sci-visus/NSDF-WIRED


WIRED Global

Welcome to NSDF's WIRED Global Center GitHub repository. NSDF and WIRED Global aim to connect climate, weather, power grid, and social data with essential software tools and computing resources. We will empower researchers with advanced tools for data acquisition, storage, management, integration, mining, and visualization.


Overview

Here you will find current progress on transforming climate data from diverse sources into the OpenVisus framework.

Our current goal is the data curation of Smoke Forecasts from The Weather Forecast Research Team at the University of British Columbia.

Current Tasks

  • Providing metadata that records, for each timestamp, when its forecast was last updated and whether its data was resampled (a true/false value).
  • Parallelizing and extending the conversion script so that it robustly incorporates new timestamps.
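
The second task can be sketched as follows. This is a minimal illustration, not the repository's conversion script: it assumes each timestamp can be converted independently, and the names `pending_timestamps`, `convert_timestamp`, and `convert_new`, as well as the example dates, are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timedelta


def pending_timestamps(available, already_converted):
    """Return timestamps that still need conversion, in chronological order."""
    done = set(already_converted)
    return sorted(t for t in available if t not in done)


def convert_timestamp(ts):
    """Hypothetical placeholder for converting one timestamp.

    A real implementation would read that timestamp's netCDF data
    and write it into the IDX dataset.
    """
    return ts, "converted"


def convert_new(available, already_converted, max_workers=4):
    """Convert only the timestamps not yet converted, several at a time."""
    todo = pending_timestamps(available, already_converted)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map() returns results in the same order as `todo`.
        return list(pool.map(convert_timestamp, todo))


# Made-up example: six timestamps, the first four already converted.
start = datetime(2021, 3, 4, 0)
available = [start + timedelta(hours=h) for h in range(6)]
already_converted = available[:4]
results = convert_new(available, already_converted)
```

Skipping timestamps that are already converted is what would let the same script be rerun as new forecasts arrive, without redoing earlier work.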

Project Structure

Curating the set of netCDF files containing Smoke Forecasts into a single 3D array stored in OpenVisus' IDX format requires iterative refinement. We return to different steps in the process to produce a higher-quality curation each time, which is why there are multiple versions of the steps outlined below.

Each step is described by its directory name. The directories contain the following:

  • data_download: Scripts for downloading the netCDF files from FireSmoke Canada.
  • conversion: Versions of our script to convert the hundreds of netCDF files to a single IDX file.
  • visualizations: Quick and dirty visualizations to inspect our data under different contexts. The demos subdirectory holds versions of self-contained demos that use the data in IDX format.
  • data_quality: Scripts for identifying gaps and issues such as silent corruption.
  • data: The final (non-interim) netCDF file.
  • scribbles.ipynb: A notebook that serves as a scratch pad. Code here may or may not be incorporated into the workflows above.

Check the readme.md of each directory for further details about each step.
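
As an illustration of the kind of check the data_quality step performs, here is a minimal sketch of gap detection over a list of forecast timestamps. It is not taken from the repository's scripts: the function names are hypothetical, and the one-hour spacing is an assumption you should adjust to the actual forecast interval.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, step=timedelta(hours=1)):
    """Return (before, after) pairs where consecutive timestamps are
    more than `step` apart. `step` is an assumed forecast interval."""
    ordered = sorted(set(timestamps))
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > step]


def missing_timestamps(timestamps, step=timedelta(hours=1)):
    """List every timestamp that falls inside a gap, on the `step` grid."""
    missing = []
    for before, after in find_gaps(timestamps, step):
        t = before + step
        while t < after:
            missing.append(t)
            t += step
    return missing


# Made-up example: hours 3 and 4 are absent from the series.
base = datetime(2024, 1, 1)
series = [base + timedelta(hours=h) for h in (0, 1, 2, 5, 6)]
gaps = find_gaps(series)
absent = missing_timestamps(series)
```

A check like this only finds missing timestamps. Detecting silent corruption inside a file that is present needs a separate inspection of the data values.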

Directory Tree, 2 Levels Down

.
├── conda_environment.yml
├── conversion
│   ├── conversion_sequence_debug.ipynb
│   ├── firesmoke_to_idx2.ipynb
│   ├── firesmoke_to_idx_v1.ipynb
│   ├── firesmoke_to_idx_v2.ipynb
│   ├── firesmoke_to_idx_v3.ipynb
│   ├── firesmoke_to_idx_v4.ipynb
│   ├── westerncanada_parallel_idx.ipynb
│   └── readme.md
├── data
│   ├── firesmoke_metadata_current.nc
│   └── readme.md
├── data_download
│   ├── get_data_v0.py
│   ├── get_data_v1.py
│   ├── get_data_v1-westerncanada.py
│   └── readme.md
├── data_quality
│   ├── data_quality
│   ├── metadata_creation
│   └── readme.md
├── readme.md
├── scribbles.ipynb
└── visualizations
    ├── demos
    ├── firesmoke-dashboard.ipynb
    ├── firesmoke-viz.ipynb
    ├── firesmoke_idx-viz.ipynb
    ├── make_videos
    └── readme.md

Implementation Notes

  • This work was done in Python 3.9.18. conda_environment.yml defines the conda environment we used during the development of this project.
  • Data curation has been run on SCI Institute resources. Because of the large volume of data, files are downloaded to those systems and processed there.
  • Running the workflow described above may be difficult on a consumer-grade machine. Consider testing with small batches of data. Full data processing can be run on University of Utah and/or University of British Columbia nodes.
  • Change the directory names within the notebooks to fit your work environment.
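
One way to handle the last two notes is to keep all paths in one place and limit test runs to a few files. The sketch below is only a suggestion: the environment variable names, default directories, and helper functions are hypothetical and do not appear in the notebooks.

```python
import os
from pathlib import Path


def resolve_dir(env_var, default):
    """Return the directory named by `env_var`, or `default` if unset."""
    return Path(os.environ.get(env_var, default)).expanduser()


def small_batch(files, limit=5):
    """Return the first `limit` files in sorted order for a quick test run."""
    return sorted(files)[:limit]


# Hypothetical variable names and defaults; adapt them to your environment.
NETCDF_DIR = resolve_dir("FIRESMOKE_NETCDF_DIR", "./netcdf")
IDX_DIR = resolve_dir("FIRESMOKE_IDX_DIR", "./idx")
```

With paths resolved this way, moving between a laptop and a cluster node means setting two environment variables instead of editing each notebook.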

Contributing

Data curation by its nature keeps surfacing new issues and requirements. Please feel free to open pull requests if you are confident that a script and/or edit improves one of the steps above. Please describe in concrete terms what changes your work yields.

Contact

Please feel free to contact us here for detailed information:

Related Works

1. Pascucci, Valerio, et al. "The ViSUS visualization framework." High Performance Visualization. Chapman and Hall/CRC, 2012. 439-452.