Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Define GitHub actions workflows for building and testing dbt #50
Define GitHub actions workflows for building and testing dbt #50
Changes from 52 commits
88494fd
997f3e4
bc0e000
ff4fd81
297f7ec
0396849
f833271
3fb81f6
5b37bf8
e8ec9d0
f4ba9de
8530109
b26cefc
3ff6d98
0607be7
98a42ef
03e4dc1
b99d0da
829a456
cb0299b
31de4e0
7b0037a
9bf6cac
76db394
4d552aa
27f556c
15a86d2
7ab4b69
9304b2f
b291800
b4ec438
52fc11a
640387f
dfe8785
6d685a0
1e5331d
620877b
f2d0508
cf5995b
d34c83b
d2909ea
7538f9e
c3a7a62
e10540f
b5cec8d
68f914c
1d4ec5a
8af875c
87353b2
37ba0bb
b36fec7
5bdf9a2
cef322a
e51c677
f9b5d86
704ed13
1738374
672a7d4
60c2bb1
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not strictly necessary for us to factor out these variables, since we don't currently have plans to reuse this action outside of our current dbt project, but the GitHub Actions docs recommend using variables instead of hardcoded paths so I'm doing so here:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually didn't know you could re-use in-repo actions. I sillily used a composable workflow in PTAXSIM instead. Will fix: ccao-data/ptaxsim#15. Thanks for teaching me something today!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly, both workflows share a common set of environment variables that are loaded using this action.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This workflow represents the build and test that should run on every PR and push to our main branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a number of references to
data-catalog
here that cover for the fact that we don't expect this workflow to be pulled intomaster
immediately, and that we want to treatdata-catalog
as a CI environment until we're ready to mergedata-catalog
intomaster
(#55). Part of that merge issue will be stripping out the references todata-catalog
in these workflows and ensuring that they are ready to be run againstmaster
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that these event types represent the default event types plus
closed
. Docs:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More details on these permissions settings here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The
restore-keys
setting here means that we will fall back to thedata-catalog
ormaster
branch caches if the PR branch cache does not exist (which we only ever expect to happen on the first pushes to a PR, before the PR has had a successful run).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of things are going on here:
--target
is ensuring that we're running against the correct environment (CI or prod)-s state:modified
is only selecting resources that have changed since the last run--defer
is instructing dbt to reuse resources that are marked as created in the state file, even if they don't exist in the current environmentdata-catalog
branch, dbt will reuse resources created in thedata-catalog
environment without needing to recreate them in the PR environment (docs)--state
instructs dbt on where to load the state file, which we keep cachedThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This workflow represents the tests that we will want to trigger from our nightly import process (ccao-data/service-sqoop-iasworld#1). It is not fully tested yet as part of this project, since manual workflow dispatch is not possible until the workflow has been pulled into the main branch (source). Instead, I tested the workflow by manually editing the
on:
key below to run the workflow on pushes to this pull request; test output here.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have confirmed by running these tests locally that these changes represent additional malformed data that has been added to our source data since this PR opened. The fact that this happened twice during the ~week that this PR was open is an indication to me that I may want to pivot soon to focusing on resolving these data problems, since they're going to become annoying quite quickly once we're running these tests on every PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion (non-blocking): Some of these are likely just under-specified tests, but I agree it's going to be a problem. Grab @wrridgeway this week for a review/debugging session of the current tests. He knows more than anyone but Mirella about the horrors of our changing data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call, I'll set aside some time for us tomorrow 👍🏻
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I scheduled some time tomorrow afternoon! In the meantime, I had to bump the thresholds yet again in 672a7d4☹️