-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adjust dbt profiles and schema macro to support prod/CI targets #44
Changes from all commits
1fd82dd
902f61d
62dc2b3
181e516
2f57842
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,16 +1,57 @@ | ||
-- Override the default schema naming to remove the dbt-added prefix. | ||
-- Override the default schema naming to remove the autogenerated prefix | ||
-- and replace it with our own namespacing on dev and CI. | ||
-- See: https://docs.getdbt.com/docs/build/custom-schemas | ||
{% macro generate_schema_name(custom_schema_name, node) -%} | ||
{{ return(_generate_schema_name( | ||
custom_schema_name, | ||
node, | ||
target, | ||
env_var, | ||
exceptions.raise_compiler_error | ||
)) }} | ||
{%- endmacro %} | ||
|
||
-- Helper implementation of generate_schema_name where no arguments | ||
-- are read from the global variable context. Useful for unit testing. | ||
{% macro _generate_schema_name( | ||
custom_schema_name, node, target, env_var_func, raise_error_func | ||
) %} | ||
{# | ||
According to the dbt docs linked above, this is required to be set by | ||
the built-in macro that we are overriding, but we don't actually use it | ||
#} | ||
{%- set default_schema = target.schema -%} | ||
|
||
{%- if target.name == "dev" -%} | ||
{%- set schema_prefix = "dev_" ~ env_var_func("USER") ~ "_" -%} | ||
{%- elif target.name == "ci" -%} | ||
{%- set github_head_ref = kebab_slugify( | ||
env_var_func("GITHUB_HEAD_REF") | ||
) -%} | ||
{%- set schema_prefix = "ci_" ~ github_head_ref ~ "_" -%} | ||
{%- else -%} | ||
{%- set schema_prefix = "" -%} | ||
{%- endif -%} | ||
|
||
{%- if custom_schema_name is none -%} | ||
|
||
{{ default_schema }} | ||
{# | ||
The default schema name is not allowed, since we use subdirectory | ||
organization to map tables/views to their Athena database | ||
#} | ||
{{ return(raise_error_func( | ||
"Missing schema definition for " ~ node.name ~ ". " ~ | ||
"Its containing subdirectory is probably missing a `+schema` " ~ | ||
"attribute under the `models` config in dbt_project.yml." | ||
)) }} | ||
|
||
{%- else -%} | ||
|
||
{{ custom_schema_name | trim }} | ||
{%- set full_schema_name -%} | ||
{{ schema_prefix ~ custom_schema_name | trim }} | ||
{%- endset -%} | ||
|
||
{%- endif -%} | ||
{{ return(full_schema_name) }} | ||
|
||
{%- endmacro %} | ||
{%- endif -%} | ||
{% endmacro %} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
-- Variation of dbt_utils.slugify macro using kebab-case instead of snake_case | ||
{% macro kebab_slugify(string) %} | ||
{#- Lower case the string -#} | ||
{% set string = string | lower %} | ||
|
||
{#- Replace spaces, slashes, and underscores with hyphens -#} | ||
{% set string = modules.re.sub('[ _/]+', '-', string) %} | ||
|
||
{#- Only take letters, numbers, and hyphens -#} | ||
{% set string = modules.re.sub('[^a-z0-9-]+', '', string) %} | ||
|
||
{#- Prepends "_" if string begins with a number -#} | ||
{% set string = modules.re.sub('^[0-9]', '_' + string[0], string) %} | ||
|
||
{{ return(string) }} | ||
{% endmacro %} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
-- dbt macro unit testing inspired by this blog post: | ||
-- https://docs.getdbt.com/blog/unit-testing-dbt-packages | ||
{% macro test_all() %} | ||
{% do test_slugify() %} | ||
{% do test_generate_schema_name() %} | ||
{% endmacro %} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,76 @@ | ||
{% macro test_generate_schema_name() %} | ||
{% do test_generate_schema_name_handles_dev_env() %} | ||
{% do test_generate_schema_name_handles_ci_env() %} | ||
{% do test_generate_schema_name_handles_prod_env() %} | ||
{% do test_generate_schema_name_raises_for_default_schema_name() %} | ||
{% endmacro %} | ||
|
||
{% macro mock_env_var(var_name) %} | ||
{% if var_name == "USER" %} | ||
{{ return("testuser") }} | ||
{% elif var_name == "GITHUB_HEAD_REF" %} | ||
{{ return("testuser/feature-branch-1") }} | ||
{% else %} | ||
{{ return("") }} | ||
{% endif %} | ||
{% endmacro %} | ||
|
||
{% macro mock_raise_compiler_error(_error) %} | ||
{{ return("Compiler error raised") }} | ||
{% endmacro %} | ||
|
||
{% macro test_generate_schema_name_handles_dev_env() %} | ||
{% do assert_equals( | ||
"test_generate_schema_name_handles_dev_env", | ||
_generate_schema_name( | ||
"test", | ||
{"name": "test"}, | ||
{"schema": "default", "name": "dev"}, | ||
mock_env_var, | ||
exceptions.raise_compiler_error | ||
), | ||
"dev_testuser_test" | ||
) %} | ||
{% endmacro %} | ||
|
||
{% macro test_generate_schema_name_handles_ci_env() %} | ||
{% do assert_equals( | ||
"test_generate_schema_name_handles_ci_env", | ||
_generate_schema_name( | ||
"test", | ||
{"name": "test"}, | ||
{"schema": "default", "name": "ci"}, | ||
mock_env_var, | ||
exceptions.raise_compiler_error | ||
), | ||
"ci_testuser-feature-branch-1_test" | ||
) %} | ||
{% endmacro %} | ||
|
||
{% macro test_generate_schema_name_handles_prod_env() %} | ||
{% do assert_equals( | ||
"test_generate_schema_name_handles_prod_env", | ||
_generate_schema_name( | ||
"test", | ||
{"name": "test"}, | ||
{"schema": "default", "name": "prod"}, | ||
mock_env_var, | ||
exceptions.raise_compiler_error | ||
), | ||
"test" | ||
) %} | ||
{% endmacro %} | ||
|
||
{% macro test_generate_schema_name_raises_for_default_schema_name() %} | ||
{% do assert_equals( | ||
"test_generate_schema_name_raises_for_default_schema_name", | ||
_generate_schema_name( | ||
None, | ||
{"name": "test"}, | ||
{"schema": "default", "name": "prod"}, | ||
mock_env_var, | ||
mock_raise_compiler_error | ||
), | ||
"Compiler error raised" | ||
) %} | ||
{% endmacro %} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,60 @@ | ||
{% macro test_slugify() %} | ||
{% do test_kebab_slugify() %} | ||
{% endmacro %} | ||
|
||
{% macro test_kebab_slugify() %} | ||
{% do test_kebab_slugify_lowercases_strings() %} | ||
{% do test_kebab_slugify_replaces_spaces() %} | ||
{% do test_kebab_slugify_replaces_slashes() %} | ||
{% do test_kebab_slugify_replaces_underscores() %} | ||
{% do test_kebab_slugify_removes_special_characters() %} | ||
{% do test_kebab_slugify_handles_leading_numbers() %} | ||
{% endmacro %} | ||
|
||
{% macro test_kebab_slugify_lowercases_strings() %} | ||
{{ assert_equals( | ||
"test_kebab_slugify_lowercases_strings", | ||
kebab_slugify("TEST"), | ||
"test" | ||
) }} | ||
{% endmacro %} | ||
|
||
{% macro test_kebab_slugify_replaces_spaces() %} | ||
{{ assert_equals( | ||
"test_kebab_slugify_replaces_spaces", | ||
kebab_slugify("t e s t"), | ||
"t-e-s-t" | ||
) }} | ||
{% endmacro %} | ||
|
||
{% macro test_kebab_slugify_replaces_slashes() %} | ||
{{ assert_equals( | ||
"test_kebab_slugify_replaces_slashes", | ||
kebab_slugify("t/e/s/t"), | ||
"t-e-s-t" | ||
) }} | ||
{% endmacro %} | ||
|
||
{% macro test_kebab_slugify_replaces_underscores() %} | ||
{{ assert_equals( | ||
"test_kebab_slugify_replaces_underscores", | ||
kebab_slugify("t_e_s_t"), | ||
"t-e-s-t" | ||
) }} | ||
{% endmacro %} | ||
|
||
{% macro test_kebab_slugify_removes_special_characters() %} | ||
{{ assert_equals( | ||
"test_kebab_slugify_removes_special_characters", | ||
kebab_slugify("t!@#e$%^s&*()t"), | ||
"test" | ||
) }} | ||
{% endmacro %} | ||
|
||
{% macro test_kebab_slugify_handles_leading_numbers() %} | ||
{{ assert_equals( | ||
"test_kebab_slugify_handles_leading_numbers", | ||
kebab_slugify("123test"), | ||
"_123test" | ||
) }} | ||
{% endmacro %} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
{% macro assert_equals(test_name, value, expected) %} | ||
{% if value == expected %} | ||
{% do log(test_name ~ " - PASS", info=True) %} | ||
{% else %} | ||
{% do exceptions.raise_compiler_error( | ||
test_name ~ " - FAIL: " ~ value~ " != " ~ expected | ||
) %} | ||
{% endif %} | ||
{% endmacro %} |
dfsnow marked this conversation as resolved.
Show resolved
Hide resolved
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -11,3 +11,22 @@ athena: | |
# "database" here corresponds to a Glue data catalog | ||
database: awsdatacatalog | ||
threads: 5 | ||
ci: | ||
type: athena | ||
s3_staging_dir: s3://ccao-dbt-athena-ci-us-east-1/results/ | ||
s3_data_dir: s3://ccao-dbt-athena-ci-us-east-1/data/ | ||
region_name: us-east-1 | ||
schema: dbt-test | ||
database: awsdatacatalog | ||
# Prefix all generated data by schema, so that we can delete it when the | ||
# PR is merged | ||
s3_data_naming: schema_table | ||
Comment on lines
+21
to
+23
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am actually not totally sure if this works yet, since we're not yet using dbt to manage our CTAs, but the idea here is that eventually when dbt builds CTAs and stores them in S3 the |
||
threads: 5 | ||
prod: | ||
type: athena | ||
s3_staging_dir: s3://ccao-athena-results-us-east-1/ | ||
s3_data_dir: s3://ccao-athena-data-us-east-1/ | ||
region_name: us-east-1 | ||
schema: default | ||
database: awsdatacatalog | ||
threads: 5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question (blocking): How does this work when the name of a branch includes an unexpected character (i.e. the
/
injeancochrane/28-data-catalog-add-production-profile-to-the-dbt-configuration
)?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question! I think we can replace special characters (and uppercase letters, which are not allowed in
dbt-athena
) using Jinja template filters. I'll take a stab at that now.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dfsnow This spiraled a little bit, but I was able to make it work in 181e516! It required a new custom macro, but that led to me realize that we were starting to build a lot of logic into a collection of untested macros, so I took a minute to design a quick unit testing framework for our macros. Curious what you think!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! Nice work and definitely needed since this is getting more complex. I'm a dumb dumb and commented on a commit instead of the PR. Linking here: 181e516#r123228596