Udacity's Data Engineering Nanodegree (SQL, PySpark, Airflow, Amazon S3, Redshift)

Leoputera2407/Udacity_DataEng

Udacity_DataEng

Coursework from the Udacity Data Engineering Nanodegree, which I completed. The program teaches how to use SQL to store and manipulate data, and it covers modern cloud data lake and data warehouse solutions such as Amazon S3 and Amazon Redshift. It also teaches the basic ETL operations that any data engineer needs to know. The ETL tools covered include PySpark and Apache Airflow, which can automate ETL pipelines.

Technologies learnt

  1. PostgreSQL
  2. PySpark
  3. Apache Airflow
  4. Amazon S3
  5. Amazon Redshift

Projects Included

  1. Project 3 -- where we worked with data in Amazon S3 and loaded it into Amazon Redshift for staging
  2. Project 4 -- where we performed Data Quality Checks using PySpark
  3. Project 5 -- where we implemented Airflow to automate Data Quality Checking and Extraction
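
To give a rough idea of what a data quality check like those in Projects 4 and 5 can look like, here is a minimal sketch. It is not the code from the projects: it uses Python's built-in `sqlite3` as a stand-in for Redshift or Spark so that it runs anywhere, and the function name `run_quality_checks`, the `users` table, and the `user_id` column are all hypothetical names chosen for illustration.

```python
import sqlite3


def run_quality_checks(conn, table, key_column):
    """Run two basic checks and return a list of failure messages.

    An empty list means both checks passed. `table` and `key_column`
    are interpolated into SQL directly, so pass only trusted identifiers.
    """
    failures = []

    # Check 1: the table must contain at least one row.
    row_count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    if row_count == 0:
        failures.append(f"{table} is empty")

    # Check 2: the key column must not contain NULL values.
    null_count = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE {key_column} IS NULL"
    ).fetchone()[0]
    if null_count > 0:
        failures.append(f"{table}.{key_column} has {null_count} NULL value(s)")

    return failures


def demo():
    # Hypothetical in-memory table standing in for a staged warehouse table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (user_id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)", [(1, "alice"), (None, "bob")]
    )
    return run_quality_checks(conn, "users", "user_id")


if __name__ == "__main__":
    for message in demo():
        print("FAILED:", message)
```

In an Airflow pipeline, a function of this shape would typically be called from a task that raises an error when the returned list is non-empty, so that the run is marked as failed.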
