Issue in Databricks Asset Bundle custom template #1933

Open
kshrikant7 opened this issue Nov 25, 2024 · 3 comments
Labels
DABs DABs related issues No Autoclose


kshrikant7 commented Nov 25, 2024

Issue

I get an error when I try to create workflow scripts using a custom template. The error is:

template: :89: function "tasks" not defined. or function "job" not defined

I get this error when I use either of the following values:

"{{ tasks.A_task_key.values.B_task_key }}"

or

"{{job.parameters.ABC_parameter}}"

I have tried all the escape methods available in Go templates for the values above, but none of them work. I get the same error when I put these values in databricks_template_schema.json too.

Configuration

Create a custom template with the .tmpl file extension (following https://docs.databricks.com/en/dev-tools/bundles/custom-template.html).

Steps to reproduce the behavior

  1. Run databricks bundle init dab-container-template

Expected Behavior

When I run the command, it should read the values from the .json file and assign them to the respective variables.

Actual Behavior

Whenever I run the command above, I get the following error:

template: :89: function "tasks" not defined. or function "job" not defined

As mentioned above, I get this error both when I add the value through databricks_template_schema.json and when I use it in any abc_job.yml.tmpl file.

OS and CLI version

OS : Windows
Currently using Databricks CLI v0.235.0

Is this a regression?

I get this error in all versions, older and newer.

Debug Logs

: template: :89: function "tasks" not defined
21:43:25 ERROR failed execution pid=17564 exit_code=1 error="failed to compute file content for resources/workflows/Silver_Scoring_Job.yml.tmpl. error in resources:\n jobs:\n Silver_Scoring_Job_{{.company_code}}:\n name: "Silver Scoring Job {{.company_code}}${var.workflow_env}"\n permissions:\n - level: ${var.can_view_level_permission}\n group_name: ${var.can_view_level_permission_group_name}\n - level: ${var.can_manage_run_level_permission}\n group_name: ${var.can_run_level_permission_group_name}\n - level: ${var.can_manage_run_level_permission}\n user_name: ${var.can_manage_level_permission_user_name}\n - level: ${var.can_manage_run_level_permission}\n service_principal_name: ${var.can_manage_level_permission_for_service_principal_name_1}\n tasks:\n - task_key: Final_Model_Selection\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/1 - Final Model Selection"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: Data_Processing_Future_Weeks\n depends_on:\n - task_key: Final_Model_Selection\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/2 - Data Preparation for Future Weeks"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: Missing_Value_Treatment\n depends_on:\n - task_key: Data_Processing_Future_Weeks\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/3 - Missing value treatment"\n
source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: Croston\n depends_on:\n - task_key: Missing_Value_Treatment\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/5.1.3. Scoring Croston"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: ElasticNet\n depends_on:\n - task_key: Missing_Value_Treatment\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/5.2.1. Scoring ElasticNet"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: Holt\n depends_on:\n - task_key: Missing_Value_Treatment\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/5.1.2 Scoring Holt"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: SES\n depends_on:\n - task_key: Missing_Value_Treatment\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/5.1.1 Scoring Simple Exponential Smoothing"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: SMA\n depends_on:\n - task_key: Missing_Value_Treatment\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/5.1.4. Scoring Simple Moving Average"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: XGB\n depends_on:\n - task_key: Missing_Value_Treatment\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/5.2.2. 
Scoring XGB"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n libraries:\n - pypi:\n package: numpy==1.24.0\n - task_key: RUN_ENSEMBLING\n depends_on:\n - task_key: SES\n - task_key: ElasticNet\n - task_key: Holt\n - task_key: Croston\n - task_key: SMA\n - task_key: XGB\n condition_task:\n op: EQUAL_TO\n left: "{{ tasks.Final_Model_Selection.values.Run_Ensembling }}"\n right: "true"\n - task_key: Ensembling\n depends_on:\n - task_key: RUN_ENSEMBLING\n outcome: "true"\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/5.3. Ensembling - future forecasts"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n - task_key: Model_Results_Consolidation\n depends_on:\n - task_key: Ensembling\n - task_key: RUN_ENSEMBLING\n outcome: "false"\n run_if: AT_LEAST_ONE_SUCCESS\n notebook_task:\n notebook_path: "notebooks/4.Forecaster/4. Scoring/6 - Model Results Consolidation"\n source: ${var.code_source}\n job_cluster_key: ${var.silver_scoring_job_cluster_key}\n job_clusters:\n - job_cluster_key: ${var.silver_scoring_job_cluster_key}\n new_cluster: ${var.silver_scoring_job_cluster}\n git_source:\n git_url: ${var.git_url}\n git_provider: ${var.git_provider}\n git_branch: "${var.git_branch}"\n tags:\n env: ${var.tag_env}\n retailer: ${var.tag_retailer}\n queue:\n enabled: true\n parameters:\n - name: series_name\n default: ${var.silver_scoring_job_series_name}\n run_as:\n service_principal_name: ${var.run_as_service_principal_name}\n: template: :89: function "tasks" not defined"

@kshrikant7 kshrikant7 added the DABs DABs related issues label Nov 25, 2024
@shreyas-goenka
Contributor

Hi @kshrikant7, thanks for reaching out! The supported syntax in the DABs templates is the same as that for Go text templates: https://pkg.go.dev/text/template

In these templates, you refer to variable values by having a . prefix. For any fields that you define in your databricks_template_schema.json file, the reference would look something like: {{ .project_name }}. In the example you shared, {{ .project_name }} is the only key in the databricks_template_schema.json file so only that can be interpolated.

You can refer to templates here as a reference for how the syntax works: https://github.com/databricks/cli/tree/main/libs/template/templates

Could you please share why you are trying to interpolate {{ tasks.A_task_key.values.B_task_key }}? I'm not sure the databricks bundle init command is the right one for your use case.

@shreyas-goenka shreyas-goenka self-assigned this Nov 25, 2024
@kshrikant7
Author

kshrikant7 commented Nov 26, 2024

@shreyas-goenka

I have defined variables other than project_name in databricks_template_schema.json, something like the following:

{
  "properties": {
    "bundle_name": {
      "type": "string",
      "default": "ABC",
      "description": "Bundle name",
      "order": 1
    },
    "project_name": {
      "type": "string",
      "default": "XYZ",
      "description": "Project name",
      "order": 2
    },
    "company_code": {
      "type": "string",
      "default": "ABC",
      "description": "Company Name Code",
      "order": 10
    },
    "workflow_dev_env": {
      "type": "string",
      "default": "DEV",
      "description": "Workflow environment DEV/PRD",
      "order": 10
    }
  }
}

And I am passing the values to databricks.yml as follows:

variables:
  company_code:
    default: {{.company_code}}

  workflow_dev_env:
    default: {{.workflow_dev_env}}

These values are then accessed in the actual workflows like any other bundle variable:

${var.variable_name}

I am able to access all the other variables using the methods above; the issue arises only when the value contains "{{tasks.A_task_key.values.B_task_key}}" or "{{job.parameters.ABC_parameter}}".

As for why I am using it, here are the full workflow scripts:

Workflow using {{job.parameters.ABC_parameter}}

resources:
  jobs:
    Alert_FTP_Ingestion_{{.company_code}}:
      name: "Alert FTP Ingestion {{.company_code}}${var.workflow_env}"
      permissions:
        - level: ${var.can_view_level_permission}
          group_name: ${var.can_view_level_permission_group_name}
        - level: ${var.can_manage_run_level_permission}
          group_name: ${var.can_run_level_permission_group_name}
        - level: ${var.can_manage_run_level_permission}
          user_name: ${var.can_manage_level_permission_user_name}
        - level: ${var.can_manage_run_level_permission}
          service_principal_name: ${var.can_manage_level_permission_for_service_principal_name_1}
      tasks:
        - task_key: skip_alert_ingestion_temp
          condition_task:
            op: EQUAL_TO
            left: "{{job.parameters.skip_alert_ingestion_temp}}"
            right: "true"
        - task_key: Export_FTP
          depends_on:
            - task_key: skip_alert_ingestion_temp
              outcome: "false"
          notebook_task:
            notebook_path: ""
            source: ${var.code_source}
          job_cluster_key: ${var.alert_FTP_ingestion_job_cluster_key}
      job_clusters:
        - job_cluster_key: ${var.alert_FTP_ingestion_job_cluster_key}
          new_cluster: ${var.alert_ftp_generation_job_cluster}
      git_source:
        git_url: ${var.git_url}
        git_provider: ${var.git_provider}
        git_branch: ${var.git_branch}
      tags:
        env: ${var.tag_env}
        retailer: ${var.tag_retailer}
      parameters:
        - name: skip_alert_ingestion_temp
          default: "${var.alert_FTP_ingestion_job_skip_alert_ingestion_temp}"
      run_as:
        service_principal_name: ${var.run_as_service_principal_name}

Workflow using {{tasks.A_task_key.values.B_task_key}}

resources:
  jobs:
    {{.project_name}}_{{.company_code}}:
      name: "{{.project_name}} {{.company_code}}${var.workflow_env}"
      email_notifications:
        on_failure:
          - ${var.on_failure_email_notification}
      schedule:
        quartz_cron_expression: ${var.schedule_quartz_cron_expression}
        timezone_id: ${var.schedule_timezone_id}
        pause_status: ${var.schedule_pause_status}    
      tasks:
        - task_key: Data_Ingestion_Job
          run_job_task:
            job_id: ${resources.jobs.Data_Ingestion_{{.company_code}}.id} 
        - task_key: Trigger_Pipelines
          depends_on:
            - task_key: Data_Ingestion_Job
          notebook_task:
            notebook_path: "notebooks/1.Data_Ingestion/NAUSWALGREEN/Trigger"
            source: ${var.code_source}
          job_cluster_key: ${var.trigger_pipeline_job_cluster_key}
          max_retries: 3
          min_retry_interval_millis: 600000
        - task_key: run_alert_ftp_ingestion
          depends_on:
            - task_key: Trigger_Pipelines
          condition_task:
            op: EQUAL_TO
            left: "{{ tasks.Trigger_Pipelines.values.Run_Export_FTP }}"
            right: "true"
        - task_key: Export_FTP_without_Alert
          depends_on:
            - task_key: run_alert_ftp_ingestion
              outcome: "true"
          run_job_task:
            job_id: ${resources.jobs.Alert_FTP_Ingestion_{{.company_code}}.id}
        - task_key: run_alerts
          depends_on:
            - task_key: Trigger_Pipelines
          condition_task:
            op: EQUAL_TO
            left: "{{ tasks.Trigger_Pipelines.values.Run_Broker_Pipeline }}"
            right: "true"
        - task_key: Alert_Generation_Job
          depends_on:
            - task_key: run_alerts
              outcome: "true" 
          run_job_task:
            job_id: ${resources.jobs.Alert_Generation_{{.company_code}}.id}
        - task_key: Alert_FTP_Ingestion
          depends_on:
            - task_key: Alert_Generation_Job
          run_job_task:
            job_id: ${resources.jobs.Alert_FTP_Ingestion_{{.company_code}}.id}
        - task_key: Refit_Run
          depends_on:
            - task_key: Alert_FTP_Ingestion
          condition_task:
            op: EQUAL_TO
            left: "{{ tasks.Trigger_Pipelines.values.Refit }}"
            right: "true"
        - task_key: Refit_Job
          depends_on:
            - task_key: Refit_Run
              outcome: "true"
          run_job_task:
            job_id: ${resources.jobs.Refit_{{.company_code}}.id}
        - task_key: Retrain_Run
          depends_on:
            - task_key: Alert_FTP_Ingestion
          condition_task:
            op: EQUAL_TO
            left: "{{ tasks.Trigger_Pipelines.values.Retrain }}"
            right: "true"
        - task_key: Retrain_Job
          depends_on:
            - task_key: Retrain_Run
              outcome: "true"
          run_job_task:
            job_id: ${resources.jobs.Retrain_{{.company_code}}.id}
      job_clusters:
        - job_cluster_key: ${var.trigger_pipeline_job_cluster_key}
          new_cluster: ${var.trigger_pipeline_job_cluster}
      git_source:
        git_url: ${var.git_url}
        git_provider: ${var.git_provider}
        git_branch: "${var.git_branch}"
      tags:
        env: ${var.tag_env}
        retailer: ${var.tag_retailer}
      queue:
        enabled: true
      run_as:
        service_principal_name: ${var.run_as_service_principal_name}
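For the literal {{ tasks... }} and {{job.parameters...}} references in the workflows above, plain Go text/template offers two common ways to emit the braces verbatim. The sketch below shows both with the standard library only. It assumes `databricks bundle init` renders .tmpl files with the same parsing rules, which this thread does not confirm, and `render` is again a hypothetical helper, not a CLI function.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// render is a hypothetical helper: parse src as a Go text/template,
// execute it against data, and return the rendered text.
func render(src string, data map[string]any) (string, error) {
	tmpl, err := template.New("").Parse(src)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := tmpl.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	data := map[string]any{"company_code": "ABC"}

	lines := []string{
		// Ordinary substitution from the template data.
		`Alert_FTP_Ingestion_{{.company_code}}:`,
		// Option 1: wrap the whole reference in a raw (backtick) string literal.
		"left: \"{{`{{ tasks.Trigger_Pipelines.values.Refit }}`}}\"",
		// Option 2: emit only the delimiters as quoted string literals.
		`left: "{{ "{{" }}job.parameters.skip_alert_ingestion_temp{{ "}}" }}"`,
	}

	for _, src := range lines {
		out, err := render(src, data)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(out)
	}
}
```

In a .yml.tmpl file the template source would be written exactly as the text inside the Go string literals, without the Go-level backslash escapes. If the same text is placed in a default value in databricks_template_schema.json, the JSON string would additionally need its inner double quotes escaped.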


github-actions bot commented Jan 2, 2025

This issue has not received a response in a while. If you want to keep this issue open, please leave a comment below and auto-close will be canceled.
