Workflow for new data addition
brandon whitehead edited this page Sep 10, 2024
·
12 revisions
Step | Lead | Do | Measure |
---|---|---|---|
Input | |||
Find data | Identify what kinds of data we are interested in and where other people interested in the same kinds of data can be found. | Talk with people, search data archives, and/or read the literature to find relevant datasets. | Do people follow up on conversations? Is there engagement? Are we finding data that matches our needs? |
Open ticket | Collect all available information on the dataset. | Document the dataset in a new ticket (Issue). | Can other people find the dataset? Do they know where to go for information about the data? |
Evaluate | Place the data contribution in the context of the desired product. | Extend the ticket to describe how the dataset overlaps with the desired product. | Can we make a decision on prioritization of this specific data contribution over other candidates? |
Transformation | |||
Annotations | Read through the documentation. | Annotate the dataset with id-variable-type-entry tuples. | Do the annotations match the data? Is the level of detail in the method descriptions clear? |
Read script | Understand the data model. | Transform the dataset into standardized id-variable-type-entry tuples. | Are the transformations that the data went through clear (good comments)? |
Integration | Identify comparable variable-methods. | Integrate the new data collection. | Is it clear how the variables are connected across the data and why? Could someone else reasonably agree with your decisions? |
Output | |||
QA/QC | Identify the visuals, min/max checks, and controlled vocabulary checks needed for the new data contribution. | QA/QC the data collection with this new data contribution. | How does the data contribution compare with the larger collection? |
Merge to main | Create a pull request and identify a reviewer. | Merge into the main branch. | Did the main branch break? Does the collection still work? Can folks assemble the collection from scripts? |
Publish | Identify who is interested in the new updates. | Announce new data availability. | Did we get new downloads of the repository? Were there questions that need to be added to the documentation that we missed? |
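
The Transformation and QA/QC steps in the table can be illustrated with a minimal Python sketch. It is not the project's actual read script. The column names, the `ANNOTATIONS` mapping, the helper functions, and the reading of "type" as one of `value`, `unit`, or `method` are all assumptions made for illustration.

```python
from typing import NamedTuple


class Record(NamedTuple):
    id: str        # row identifier taken from the source table
    variable: str  # name of the measured quantity
    type: str      # assumed role of the entry: "value", "unit", or "method"
    entry: str     # the cell content itself, kept as text


# Hypothetical annotations: source column name -> (variable, type).
ANNOTATIONS = {
    "temp_c": ("temperature", "value"),
    "temp_method": ("temperature", "method"),
    "depth_cm": ("depth", "value"),
}


def to_long(rows, id_column, annotations):
    """Transform wide rows (dicts) into id-variable-type-entry records."""
    records = []
    for row in rows:
        for column, (variable, type_) in annotations.items():
            entry = row.get(column)
            if entry is None or entry == "":
                continue  # skip missing cells rather than emit empty entries
            records.append(Record(str(row[id_column]), variable, type_, str(entry)))
    return records


def check_range(records, variable, minimum, maximum):
    """Return the value records for `variable` that fall outside [minimum, maximum]."""
    return [
        r for r in records
        if r.variable == variable and r.type == "value"
        and not (minimum <= float(r.entry) <= maximum)
    ]


def check_vocabulary(records, variable, type_, allowed):
    """Return the records whose entry is not in the controlled vocabulary."""
    return [
        r for r in records
        if r.variable == variable and r.type == type_ and r.entry not in allowed
    ]


# Made-up example rows, not real data.
rows = [
    {"sample": "A1", "temp_c": 12.5, "temp_method": "thermocouple", "depth_cm": 10},
    {"sample": "A2", "temp_c": 140, "temp_method": "unknown", "depth_cm": 20},
]

records = to_long(rows, "sample", ANNOTATIONS)
out_of_range = check_range(records, "temperature", -50, 60)
bad_methods = check_vocabulary(
    records, "temperature", "method", {"thermocouple", "thermistor"}
)
```

Here `out_of_range` and `bad_methods` each flag the second sample, which is the kind of result a reviewer would compare against the larger collection during QA/QC.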