Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid importing both rasterio and osgeo #1351

Open
groutr opened this issue Nov 26, 2024 · 6 comments · Fixed by #1382
Open

Avoid importing both rasterio and osgeo #1351

groutr opened this issue Nov 26, 2024 · 6 comments · Fixed by #1382
Assignees
Labels
enhancement New feature or request

Comments

@groutr
Copy link

groutr commented Nov 26, 2024

There are several places in this codebase that import both rasterio (or libraries that use rasterio) and osgeo gdal bindings. This is highly discouraged by the rasterio team.

https://rasterio.readthedocs.io/en/stable/topics/switch.html#mutual-incompatibilities

GDAL’s bindings (gdal for the rest of this document) and Rasterio are not entirely compatible and should not, without a great deal of care, be imported and used in a single Python program. The reason is that the dynamic library they each load (these are C extension modules, remember), libgdal.so on Linux, gdal.dll on Windows, has a number of global objects and the two modules take different approaches to managing these objects.

ping: @RobHanna-NOAA

@groutr groutr changed the title Avoid mixing rasterio and osgeo imports Avoid importing both rasterio and osgeo Nov 26, 2024
@RobHanna-NOAA RobHanna-NOAA added the enhancement New feature or request label Nov 26, 2024
@RobHanna-NOAA
Copy link
Contributor

This could potentially have significant performance impacts.

@RobHanna-NOAA
Copy link
Contributor

Not sure we have any or many that use both of those in one python file. That being said, we have a bunch of places that use gdal directly via shell commands in bash, which often do command line calls to python files to do more processing. Certainly worth looking in to.

@mluck
Copy link
Contributor

mluck commented Dec 19, 2024

The imports don't necessarily have to happen in the same file, but rather in the same python interpreter session.

Scripts that do this: https://github.com/NOAA-OWP/inundation-mapping/blob/1fa128f84460a7e3b15885a72e8a006d451812b3/data/create_vrt_file.py (imports shared_functions which imports rasterio)

https://github.com/NOAA-OWP/inundation-mapping/blob/1fa128f84460a7e3b15885a72e8a006d451812b3/src/make_rem.py (imports both libraries)

https://github.com/NOAA-OWP/inundation-mapping/blob/1fa128f84460a7e3b15885a72e8a006d451812b3/tools/inundate_nation.py (imports both)

https://github.com/NOAA-OWP/inundation-mapping/blob/1fa128f84460a7e3b15885a72e8a006d451812b3/data/ble/ble_benchmark/create_ble_benchmark.py (imports preprocess_benchmark which imports rasterio)

https://github.com/NOAA-OWP/inundation-mapping/blob/1fa128f84460a7e3b15885a72e8a006d451812b3/data/bathymetry/preprocess_bathymetry.py (imports both)

Only one of these files (src/make_rem.py) is used in the main FIM pipeline (fim_process_unit_wb.sh), and in that file GDAL was imported but never used. In addition, only one other file (src/getRasterInfoNative.py) imported gdal so that file has been re-written to use rasterio (in PR #1382) so now there are no GDAL imports in FIM pipeline until post-processing.

@CarsonPruitt-NOAA CarsonPruitt-NOAA linked a pull request Jan 3, 2025 that will close this issue
20 tasks
@mluck
Copy link
Contributor

mluck commented Jan 7, 2025

Reopening this issue since #1382 resolved the issues in FIM pipeline but there are additional cases in pre- and post-processing that are still being addressed.

@mluck mluck reopened this Jan 7, 2025
@mluck
Copy link
Contributor

mluck commented Jan 7, 2025

I was getting tripped up trying to test replacing computing slopes with gdal.DEMProcessing() in

slope_tif = gdal.DEMProcessing(output_slope_tif, bathy_gdal, 'slope', format='GTiff')
for parity with the TauDEM d8flowdir that is used to compute slopes in
mpiexec -n $ncores_fd $taudemDir2/d8flowdir \

When compared with Whitebox slopes, it looks like TauDEM is resulting in some abnormal slope values. For example,
image
The covariance matrix shows that TauDEM is producing some outliers (r~0.916) while GDAL and Whitebox are producing more agreeable rasters (r=0.997).
N = 19237458

-- GDAL TauDEM Whitebox
GDAL 1.000000 0.916421 0.996747
TauDEM 0.916421 1.000000 0.915627
Whitebox 0.996747 0.915627 1.000000

As a next step, I am going to investigate replacing both GDAL and TauDEM slopes with Whitebox.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants