Backfill nearest neighbor outputs for single dates #1333

Merged
tiffanychu90 merged 1 commit into main from backfill-nn on Dec 23, 2024

Conversation

tiffanychu90
Member

  • Try several ways of structuring the nearest neighbor script: dask.map_partitions, itertuples, unpacking iterables, list comprehension, and the existing np.vectorize
  • Use a dict to look up the opposite direction to exclude for stop_primary_direction, and move that lookup out of the loop. This is the only change that actually saves a couple of minutes
  • Fix a minor detail: drop null shape geometries
  • Backfill the nearest neighbor output for every single date, for all 3 segment types, and delete the stage2b outputs from GCS
  • Epic - GTFS analytics pipeline performance improvements #1315
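
The dict change in the second bullet can be sketched roughly as below. This is a minimal illustration and not the code from the PR: the direction labels, the `OPPOSITE_DIRECTION` and `KEEP_BY_DIRECTION` names, the `directions_to_keep` helper, and the sample stops are all assumptions. The point is only that the opposite-direction lookup is computed once per direction rather than once per row.

```python
# Hypothetical direction labels; the real values in the pipeline may differ.
OPPOSITE_DIRECTION = {
    "Northbound": "Southbound",
    "Southbound": "Northbound",
    "Eastbound": "Westbound",
    "Westbound": "Eastbound",
}


def directions_to_keep(stop_primary_direction):
    """Return every direction except the one opposite to the stop's direction."""
    opposite = OPPOSITE_DIRECTION.get(stop_primary_direction)
    return [d for d in OPPOSITE_DIRECTION if d != opposite]


# Built once, outside the loop: one entry per direction instead of one
# computation per stop row.
KEEP_BY_DIRECTION = {d: directions_to_keep(d) for d in OPPOSITE_DIRECTION}

# Hypothetical (stop_id, stop_primary_direction) rows.
stops = [("stop_a", "Northbound"), ("stop_b", "Eastbound")]

# Inside the loop, only a dict lookup remains.
result = {stop_id: KEEP_BY_DIRECTION[direction] for stop_id, direction in stops}
```

With only a handful of distinct directions, the per-row work becomes a constant-time lookup, which is in line with the bullet's note that moving this step out of the loop is where the time was saved.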

tiffanychu90 merged commit 2b18dd1 into main on Dec 23, 2024
2 checks passed
tiffanychu90 deleted the backfill-nn branch on December 23, 2024 19:13