Commit: PR review comments addressed

satish-chinthanippu committed Jul 16, 2024
1 parent 785f352 commit 42f9334
Showing 1 changed file with 11 additions and 51 deletions.
[source, bash]
----
pip install dbt-teradata dbt-core
----

== Create a database

NOTE: Use a database client like BTEQ, Lake Editor, Teradata Studio, or DBeaver to execute the `CREATE DATABASE` query.

Let's create the `jaffle_shop` database in the Vantage Cloud Lake instance with TD_OFSSTORAGE as the default storage.

[source, teradata-sql]
----
CREATE DATABASE jaffle_shop AS
PERMANENT = 120e6, -- 120MB
SPOOL = 120e6; -- SPOOL size assumed; only the PERMANENT line is visible in this diff
----

== Create a database user

NOTE: Use a database client like BTEQ, Lake Editor, Teradata Studio, or DBeaver to execute the `CREATE USER` query.

Let's create the `lake_user` user in the Vantage Cloud Lake instance.

[source, teradata-sql]
----
CREATE USER lake_user AS
PERMANENT = 100e6, -- size assumed; this line is collapsed in the diff
PASSWORD = lake_user,
DEFAULT DATABASE = jaffle_shop;
----

== Grant access to user

NOTE: Use a database client like BTEQ, Lake Editor, Teradata Studio, or DBeaver to execute the `GRANT` queries.

Let's grant user `lake_user` the privileges required to manage compute clusters.

[source, teradata-sql]
----
-- (GRANT statements collapsed in this diff view)
----

Next, configure the `jaffle_shop` dbt profile with `dev` as the default target:

[source, yaml]
----
jaffle_shop:
  ...
  target: dev
----
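
If you'd rather script the three setup steps above than run them in a SQL client, here is a minimal sketch using Teradata's `teradatasql` Python driver. The hostname and admin credentials are placeholders, and the statements mirror the sketches above rather than the collapsed originals:

[source, python]
----
# Minimal sketch: run the setup DDL through the teradatasql driver.
# Placeholders: <lake-instance-hostname>, <admin-user>, <admin-password>.
import teradatasql

SETUP_STATEMENTS = [
    # PERM/SPOOL sizes follow the sketches above; adjust for your instance.
    "CREATE DATABASE jaffle_shop AS PERMANENT = 120e6, SPOOL = 120e6",
    "CREATE USER lake_user AS PERMANENT = 100e6, PASSWORD = lake_user,"
    " DEFAULT DATABASE = jaffle_shop",
    # Representative grant; add whatever compute-cluster privileges your
    # environment requires (the exact statements are collapsed in the diff).
    "GRANT ALL ON jaffle_shop TO lake_user",
]

con = teradatasql.connect(
    host="<lake-instance-hostname>",
    user="<admin-user>",
    password="<admin-password>",
)
try:
    cur = con.cursor()
    for stmt in SETUP_STATEMENTS:
        cur.execute(stmt)
finally:
    con.close()
----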

== Configure Apache Airflow

1. Configure the listed environment variables to activate the test connection button and to prevent loading of the sample DAGs and default connections in the Airflow UI.
+
[source, bash]
export AIRFLOW__CORE__TEST_CONNECTION=Enabled
export AIRFLOW__CORE__LOAD_EXAMPLES=false
export AIRFLOW__CORE__LOAD_DEFAULT_CONNECTIONS=false

2. Define the path of the jaffle_shop project as the environment variable `dbt_project_home_dir`.
+
[source, bash]
export dbt_project_home_dir=../../jaffle_shop
+
NOTE: Change `../../` to the path of the jaffle_shop project.

3. Define the virtual environment path where dbt-teradata was installed in <<Install dbt>> as the environment variable `dbt_venv_dir`.
+
[source, bash]
export dbt_venv_dir=/../../dbt_env/bin/activate
+
NOTE: Change `/../../` to the path where the virtual environment is defined.

== Start Apache Airflow web server
1. Run the Airflow web server
+
[source, bash]
----
airflow standalone
----
2. Access the Airflow UI: visit http://localhost:8080 in the browser and log in with the admin account details shown in the terminal.

== Define Apache Airflow connection to Vantage Cloud Lake

1. Click on Admin -> Connections.
2. Click on + to define a new connection to the Teradata Vantage Cloud Lake instance.
3. Define a new connection with id `teradata_lake` using the Teradata Vantage Cloud Lake instance details.
* Connection Id: teradata_lake
* Connection Type: Teradata
* Database Server URL (required): Teradata Vantage Cloud Lake instance hostname to connect to.
* Database: jaffle_shop
* Login (required): lake_user
* Password (required): lake_user
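
As an alternative to the UI, Airflow can also read connections from `AIRFLOW_CONN_<CONN_ID>` environment variables. A small sketch that builds the URI for the connection above; the hostname is a placeholder, and the variable must be set in the environment that starts Airflow:

[source, python]
----
# Build the connection URI Airflow expects for conn_id "teradata_lake".
# Setting os.environ only affects this process; export the printed line
# in the shell that starts Airflow instead.
import os

uri = "teradata://lake_user:lake_user@<lake-instance-hostname>/jaffle_shop"
os.environ["AIRFLOW_CONN_TERADATA_LAKE"] = uri
print(f"export AIRFLOW_CONN_TERADATA_LAKE='{uri}'")
----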

== Define DAG in Apache Airflow
DAGs in Airflow are defined as Python files. Define the DAG below to run the dbt transformations of the `jaffle_shop` dbt project on a Vantage Cloud Lake compute cluster; a fuller sketch follows the fragment shown here. Copy the Python code and save it as `airflow-vcl-compute-clusters-manage.py` under the directory `$AIRFLOW_HOME/files/dags`.

[source, python]
----
with DAG(
# ... (DAG body collapsed in this diff view; see the sketch after this block) ...
>> remove_compute_group_from_user
)
----
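
The body of the DAG is collapsed in this diff. As a rough guide, here is a hypothetical sketch built from the visible fragments and the compute cluster operators in `apache-airflow-providers-teradata`; the operator and parameter names, the `dbt_profile`/`dbt_group` names, and the final `MODIFY USER` statement are assumptions to verify against the provider documentation for your installed version:

[source, python]
----
# Hypothetical sketch of the collapsed DAG; verify operator and parameter
# names against your apache-airflow-providers-teradata version.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.providers.teradata.operators.teradata import TeradataOperator
from airflow.providers.teradata.operators.teradata_compute_cluster import (
    TeradataComputeClusterDecommissionOperator,
    TeradataComputeClusterProvisionOperator,
)

with DAG(
    dag_id="airflow-vcl-compute-clusters-manage",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    # Provision a compute cluster for the dbt transformations.
    create_compute_cluster = TeradataComputeClusterProvisionOperator(
        task_id="create_compute_cluster",
        compute_profile_name="dbt_profile",  # assumed name
        compute_group_name="dbt_group",      # assumed name
        teradata_conn_id="teradata_lake",
    )

    # Run the jaffle_shop project inside the dbt virtual environment,
    # using the variables from <<Configure Apache Airflow>>.
    run_dbt_transformations = BashOperator(
        task_id="run_dbt_transformations",
        bash_command="source $dbt_venv_dir && cd $dbt_project_home_dir && dbt run",
    )

    # Decommission the cluster once the transformations finish.
    drop_compute_cluster = TeradataComputeClusterDecommissionOperator(
        task_id="drop_compute_cluster",
        compute_profile_name="dbt_profile",
        compute_group_name="dbt_group",
        delete_compute_group=True,
        teradata_conn_id="teradata_lake",
    )

    # Matches the task name visible at the tail of the collapsed block;
    # the SQL is an assumption.
    remove_compute_group_from_user = TeradataOperator(
        task_id="remove_compute_group_from_user",
        sql="MODIFY USER lake_user AS COMPUTE GROUP = NULL;",
        conn_id="teradata_lake",  # parameter name may differ by version
    )

    (
        create_compute_cluster
        >> run_dbt_transformations
        >> drop_compute_cluster
        >> remove_compute_group_from_user
    )
----

Bracketing the dbt run with the provision and decommission operators means the compute cluster only exists, and only incurs cost, while the transformations execute.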


== Load DAG

When the DAG file is copied to `$AIRFLOW_HOME/files/dags`, Apache Airflow loads the DAG into the Airflow UI. (Airflow scans the directory configured as `dags_folder` in `airflow.cfg`; this guide assumes it points at `$AIRFLOW_HOME/files/dags`.)
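
To confirm the load, assuming `dags_folder` points at `$AIRFLOW_HOME/files/dags`, a quick sketch with Airflow's `DagBag`:

[source, python]
----
# Parse the dags folder and confirm the DAG imported without errors.
import os
from airflow.models import DagBag

dagbag = DagBag(
    dag_folder=os.path.expandvars("$AIRFLOW_HOME/files/dags"),
    include_examples=False,
)
print(dagbag.import_errors)  # expect an empty dict
print("airflow-vcl-compute-clusters-manage" in dagbag.dags)  # expect True
----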

== Run DAG

image::{dir}/airflow-dag-run.png[Run dag,align="left" width=75%]
In this quick start guide, we explored how to use Teradata Vantage Cloud Lake compute clusters to execute dbt transformations with the Airflow compute cluster operators.

== Further reading
* link:https://airflow.apache.org/docs/apache-airflow/stable/core-concepts/dags.html[Apache Airflow DAGs reference]
* link:https://airflow.apache.org/docs/apache-airflow-providers-teradata/stable/operators/index.html[Airflow Teradata Compute Cluster Operators]

