ETL pipeline with Airflow

This repository contains an implementation of an ETL pipeline with Airflow and Docker. The pipeline automates the process of extracting data from various sources, transforming it, and loading the transformed data into a destination.

Links: Medium article

Requirements

Install and run Docker.

Usage

Run the following to start microservices including Airflow and PostgreSQL:

docker compose up -d

Then, go to http://localhost:8080/ to access the Airflow UI.
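The containers can take a little while to come up after `docker compose up -d` returns. As a minimal sketch, the helper below polls a URL until it answers with HTTP 200. The function name `wait_for_http` and its parameters are hypothetical and not part of this repository; only the URL http://localhost:8080/ comes from the instructions above.

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url, timeout=120.0, interval=3.0, opener=urllib.request.urlopen):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds pass.

    `opener` defaults to urllib's urlopen and can be swapped out for testing.
    Returns True if the URL responded in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # The service is not accepting connections yet; retry below.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_for_http("http://localhost:8080/")` returns `True` once the web server responds, which is a reasonable point to open the UI in a browser.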


DAGs

The following DAGs are included in this repository:

  • etl_pipeline: downloads a publicly available CSV file from stats.govt.nz, transforms it by selecting a few features of the dataset, and loads the transformed data into a PostgreSQL database.
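The three stages of etl_pipeline can be illustrated with a rough, standalone sketch. This is not the DAG code from this repository, and it rests on several assumptions: an inline sample stands in for the downloaded CSV file, the column names (`year`, `industry`, `value`, `unit`) are made up, and an in-memory SQLite database stands in for PostgreSQL so that the example runs without any services. The function and table names are hypothetical as well.

```python
import csv
import io
import sqlite3

# Assumption: a tiny inline sample replaces the CSV downloaded from stats.govt.nz.
SAMPLE_CSV = """year,industry,value,unit
2020,Agriculture,100,Dollars
2021,Mining,250,Dollars
"""

# Assumption: hypothetical subset of features to keep.
KEEP = ["year", "industry", "value"]


def extract(text):
    """Parse CSV text into a list of dicts, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows, columns):
    """Keep only the selected columns of each row."""
    return [{c: row[c] for c in columns} for row in rows]


def load(rows, columns, conn, table="etl_demo"):
    """Insert rows into `table`, creating it if needed; return the row count."""
    col_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({col_list})")
    conn.executemany(
        f"INSERT INTO {table} ({col_list}) VALUES ({placeholders})",
        [tuple(row[c] for c in columns) for row in rows],
    )
    conn.commit()
    return len(rows)


# Assumption: SQLite in memory stands in for the PostgreSQL destination.
conn = sqlite3.connect(":memory:")
rows = transform(extract(SAMPLE_CSV), KEEP)
loaded = load(rows, KEEP, conn)
```

In an Airflow DAG, each of these functions would typically run as its own task, with the load step writing to the PostgreSQL service started by Docker Compose.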