Replace DSP ClinVar ingest pipeline in production with the ClinGen version #138

Open
toneillbroad opened this issue May 17, 2024 · 0 comments
@toneillbroad
Contributor

This Epic is about replacing the current DSP-written ClinVar ingest pipeline with the ClinGen-written one.

Definitionally, the ClinGen pipeline components are:

  • clinvar-ftp-watcher - detects that a new ClinVar release has dropped and initiates the ClinGen ClinVar processing pipeline
  • clinvar-ingest - the ingest workflow that processes the ClinVar release file and produces the BQ tables
  • Core BigQuery tables - 14 or so distinct ClinVar entities
  • BigQuery supplemental tables - based on the core BQ tables and used by the GenomeConnect reports and GK Pilot files
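
As a rough illustration of how the components listed above depend on one another, here is a minimal Python sketch. The labels `core-bq-tables` and `supplemental-bq-tables`, the `Component` class, and `run_order` are hypothetical names used only for this example; they are not taken from the clinvar-ftp-watcher or clinvar-ingest code, and the dependency edges are an assumption read off the descriptions above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Component:
    """One piece of the pipeline as described in this issue (illustrative only)."""
    name: str
    role: str
    depends_on: Tuple[str, ...] = ()


# Assumed dependency chain: watcher -> ingest -> core tables -> supplemental tables.
# "core-bq-tables" and "supplemental-bq-tables" are hypothetical labels.
COMPONENTS = (
    Component("clinvar-ftp-watcher",
              "detects a new ClinVar release and initiates the pipeline"),
    Component("clinvar-ingest",
              "processes the release file and produces the BQ tables",
              ("clinvar-ftp-watcher",)),
    Component("core-bq-tables",
              "core BigQuery tables, one per ClinVar entity",
              ("clinvar-ingest",)),
    Component("supplemental-bq-tables",
              "derived tables used by GenomeConnect reports and GK Pilot files",
              ("core-bq-tables",)),
)


def run_order(components: Iterable[Component]) -> List[str]:
    """Return component names in dependency order (simple topological sort)."""
    done: List[str] = []
    remaining = list(components)
    while remaining:
        # A component is ready once everything it depends on has been placed.
        ready = [c for c in remaining if all(d in done for d in c.depends_on)]
        if not ready:
            raise ValueError("cyclic or unsatisfied dependency")
        for c in ready:
            done.append(c.name)
            remaining.remove(c)
    return done


if __name__ == "__main__":
    print(" -> ".join(run_order(COMPONENTS)))
```

The point of the sketch is only that the supplemental tables sit downstream of everything else, so replacing the DSP pipeline affects every consumer of those tables.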

Environmentally, we've agreed that the following constraints apply:

  • The clingen-dev (dev) GCP cluster is the dev environment for the ClinGen ClinVar pipeline processing
  • The clingen-stage (stage) environment remains the current "production" environment for DSP processing. It is the basis of all the work Larry does to generate the CvC curation project and all of the reports he produces.
  • The clingen-dx (prod) cluster is the production environment that will subsume all of the DSP processing replaced by the ClinGen ingest pipeline.
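
The environment constraints above could be captured in a small lookup so that scripts never hard-code a target. This is only a sketch: the `Env` enum, `ROLES` mapping, and `gcp_name_for` helper are hypothetical, and whether these names are GCP project IDs, clusters, or something else is an assumption not settled by this issue.

```python
from enum import Enum


class Env(Enum):
    """Short environment name -> GCP name as used in this issue."""
    DEV = "clingen-dev"
    STAGE = "clingen-stage"
    PROD = "clingen-dx"


# Roles paraphrased from the constraints listed in this issue.
ROLES = {
    Env.DEV: "dev environment for the ClinGen ClinVar pipeline",
    Env.STAGE: "current 'production' DSP processing",
    Env.PROD: "production environment for the ClinGen ingest pipeline",
}


def gcp_name_for(short_name: str) -> str:
    """Resolve 'dev', 'stage', or 'prod' (case-insensitive) to its GCP name."""
    try:
        return Env[short_name.upper()].value
    except KeyError:
        valid = ", ".join(e.name.lower() for e in Env)
        raise ValueError(
            f"unknown environment {short_name!r}; expected one of: {valid}"
        ) from None


if __name__ == "__main__":
    for env in Env:
        print(f"{env.name.lower():5} {env.value:14} {ROLES[env]}")
```

Keeping the mapping in one place makes the unusual naming explicit: the environment called "stage" currently serves production consumers, while `clingen-dx` is the eventual production target.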