Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dbt run --select=package:dbt_project_evaluator fails #480

Closed
1 of 6 tasks
nickagel opened this issue Aug 16, 2024 · 12 comments
Closed
1 of 6 tasks

dbt run --select=package:dbt_project_evaluator fails #480

nickagel opened this issue Aug 16, 2024 · 12 comments
Labels
bug Something isn't working triage

Comments

@nickagel
Copy link

nickagel commented Aug 16, 2024

Describe the bug

dbt run --select=package:dbt_project_evaluator fails. I'm using redshift and getting a
Database Error in model stg_nodes (models/staging/graph/stg_nodes.sql)
value too long for type character varying(600)
compiled Code at target/run/dbt_project_evaluator/models/staging/graph/stg_nodes.sql

dispatch:
  - macro_namespace: dbt_utils
    search_order: ["dbt_utils"]
  - macro_namespace: dbt
    search_order: ['dbt_project_evaluator', 'dbt']
packages:
  - package: dbt-labs/dbt_utils
    version: 1.1.1
  - package: dbt-labs/codegen
    version: 0.12.1
  - package: dbt-labs/dbt_project_evaluator
    version: 0.12.2

Steps to reproduce

dbt run --select=package:dbt_project_evaluator

Expected results

successful run

Actual results

fail on creating stg_nodes table

Screenshots and log output

Screenshot 2024-08-16 at 1 22 16 PM

System information

The contents of your packages.yml file:

packages:
  - package: dbt-labs/dbt_utils
    version: 1.1.1
  - package: dbt-labs/codegen
    version: 0.12.1
  - package: dbt-labs/dbt_project_evaluator
    version: 0.12.2

Which database are you using dbt with?

  • postgres
  • redshift
  • bigquery
  • snowflake
  • trino/starburst
  • other (specify: ____________)

The output of dbt --version:

Core:
  - installed: 1.7.15
  - latest:    1.8.5  - Update available!

  Your version of dbt-core is out of date!
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation

Plugins:
  - redshift: 1.7.2  - Update available!
  - postgres: 1.7.15 - Update available!

  At least one plugin is out of date or incompatible with dbt-core.
  You can find instructions for upgrading here:
  https://docs.getdbt.com/docs/installation
@nickagel nickagel added bug Something isn't working triage labels Aug 16, 2024
@b-per
Copy link
Collaborator

b-per commented Aug 19, 2024

Hi!

Would you be able to check your logs/dbt.log file and see the query that failed and which field had more than 600 chars?
As we load data with post hooks the target/run and target/compiled folders won't show you the post hook query.

@b-per
Copy link
Collaborator

b-per commented Aug 20, 2024

If you install 0.12.3 you could also try to add this macro to your project and try with different values for the size of the string

{%- macro redshift__type_string_dpe() -%}
    {{ return(api.Column.string_type(600)) }}
{%- endmacro -%}

@nickagel
Copy link
Author

Compiled code tried with / without macro

/* Bigquery won't let us `where` without `from` so we use this workaround */
with dummy_cte as (
    select 1 as foo
)

select 

    cast(null as character varying(600)) as unique_id,
    cast(null as character varying(600)) as name,
    cast(null as character varying(600)) as resource_type,
    cast(null as character varying(600)) as file_path,
    cast(True as boolean) as is_enabled,
    cast(null as character varying(600)) as materialized,
    cast(null as character varying(600)) as on_schema_change,
    cast(null as character varying(600)) as model_group,
    cast(null as character varying(600)) as access,
    cast(null as character varying(600)) as latest_version,
    cast(null as character varying(600)) as version,
    cast(null as character varying(600)) as deprecation_date,
    cast(True as boolean) as is_contract_enforced,
    cast(0 as integer) as total_defined_columns,
    cast(0 as integer) as total_described_columns,
    cast(null as character varying(600)) as database,
    cast(null as character varying(600)) as schema,
    cast(null as character varying(600)) as package_name,
    cast(null as character varying(600)) as alias,
    cast(True as boolean) as is_described,
    cast(null as character varying(600)) as column_name,
    cast(null as character varying(600)) as meta,
    cast(null as character varying(600)) as hard_coded_references,
    cast(null as integer) as number_lines,
    cast(null as float) as sql_complexity,
    cast(null as character varying(600)) as macro_dependencies,
    cast(True as boolean) as is_generic_test,
    cast(True as boolean) as is_excluded

from dummy_cte
where false

@nickagel
Copy link
Author

Same issue with upgrading to 13.1

@b-per
Copy link
Collaborator

b-per commented Aug 26, 2024

Thanks. As mentioned, we'd need the logs from logs/dbt.log , not just the compiled code, as the issue is likely with the hooks we use to load the data.

@nickagel
Copy link
Author

Due to the potential sensitivity of the logs I've uploaded it to a repository & granted you access https://github.com/nickagel/logs

@nickagel
Copy link
Author

I also upgraded to 1.8 to see if that helped. It's might be the combination of my table names + prefixes added by the evaluator is far too large & may need to be sliced at the 600 char limit

@b-per
Copy link
Collaborator

b-per commented Sep 2, 2024

Thanks for sharing the logs.

I looked at it and the issue is that you have some accepted_values test with a lot of accepted values, which generates test ids and names of more than 800 characters.

I think that the problem comes from 2 tests on the same model that starts with dim_pricespec.

Suggestions:

  • what happens if you add a name config to those tests (here are the docs) to provide those a human friendly shorter name?
  • with that many different values allowed, I would also recommend adding the accepted values to some seed and then adding a relationship test between the model and the seed.

Let us know if you can managed testing those options and if it fixes the problem.

@nickagel
Copy link
Author

nickagel commented Sep 2, 2024

Both of those options worked for me. Thank you 😄

Just curious if there is a plan to auto resolve extra long names for tests via the evaluator?

@b-per
Copy link
Collaborator

b-per commented Sep 2, 2024

We could technically truncate some fields to suppress the error, but I am quite reluctant to put that in place because the issue you face should only be faced by:

  • people using Redshfit (other DWs don't have a limit on the number of chars)
  • and that have some accepted_values test with many values

I feel that getting the test id/name shorter or using the relationships test workaround like I suggested before is actually a good thing for your project overall.

@b-per b-per closed this as completed Sep 2, 2024
@nickagel
Copy link
Author

nickagel commented Sep 2, 2024

Fair enough, maybe an error message would help for the future then for those using redshift & who don't come across this thread 🤷🏻

@aleduc00
Copy link

I faced this issue too. This proposed solution solved the issue for me:

{%- macro redshift__type_string_dpe() -%}
    {{ return(api.Column.string_type(600)) }}
{%- endmacro -%}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working triage
Projects
None yet
Development

No branches or pull requests

3 participants