Welcome to Chinook Queries, a project where I step into the role of a data analyst for the Chinook online music store. This project highlights my SQL querying prowess, uncovering actionable insights from a rich and diverse music database. Through the use of SQL, Python, and data visualization techniques, I answered critical business questions and visualized key trends to support strategic decision-making.
- SQLite3: A lightweight and versatile database solution for querying and managing data.
- Python: For connecting to the database, executing queries, and performing data transformations.
- SQL: The foundation for uncovering trends and insights in the dataset.
- Plotly Express: For creating interactive and visually appealing data visualizations.
Objective: Calculate the total revenue generated by the Chinook store.
SELECT SUM(Total) FROM Invoice;
Result: $TOTAL_REVENUE
Objective: Identify the countries generating the most revenue, ranked in descending order.
SELECT BillingCountry, SUM(Total) AS Revenue
FROM Invoice
GROUP BY BillingCountry
ORDER BY Revenue DESC;
Top 5 Results:
Country | Revenue |
---|---|
COUNTRY_1 | $REVENUE_1 |
COUNTRY_2 | $REVENUE_2 |
COUNTRY_3 | $REVENUE_3 |
COUNTRY_4 | $REVENUE_4 |
COUNTRY_5 | $REVENUE_5 |
Objective: Discover the top 5 best-selling albums in the store.
SELECT Artist.Name, Album.Title, SUM(InvoiceLine.Quantity) AS quantity_sold
FROM InvoiceLine
JOIN Track ON InvoiceLine.TrackId = Track.TrackId
JOIN Album ON Track.AlbumId = Album.AlbumId
JOIN Artist ON Album.ArtistId = Artist.ArtistId
GROUP BY Album.Title
ORDER BY quantity_sold DESC
LIMIT 5;
Results:
Artist Name | Album Title | Quantity Sold |
---|---|---|
ARTIST_1 | ALBUM_1 | QTY_1 |
ARTIST_2 | ALBUM_2 | QTY_2 |
ARTIST_3 | ALBUM_3 | QTY_3 |
ARTIST_4 | ALBUM_4 | QTY_4 |
ARTIST_5 | ALBUM_5 | QTY_5 |
Objective: Highlight the top-performing artists based on the quantity of tracks sold.
SELECT Artist.Name, SUM(InvoiceLine.Quantity) AS amount_sold
FROM InvoiceLine
JOIN Track ON InvoiceLine.TrackId = Track.TrackId
JOIN Album ON Track.AlbumId = Album.AlbumId
JOIN Artist ON Album.ArtistId = Artist.ArtistId
GROUP BY Artist.Name
ORDER BY amount_sold DESC
LIMIT 5;
Results:
Artist Name | Amount Sold |
---|---|
ARTIST_1 | QTY_1 |
ARTIST_2 | QTY_2 |
ARTIST_3 | QTY_3 |
ARTIST_4 | QTY_4 |
ARTIST_5 | QTY_5 |
Objective: Identify the individual tracks that have sold the most copies.
SELECT Track.Name, SUM(InvoiceLine.Quantity) AS amount_sold
FROM InvoiceLine
JOIN Track ON InvoiceLine.TrackId = Track.TrackId
GROUP BY Track.Name
ORDER BY amount_sold DESC
LIMIT 5;
Results:
Track Name | Amount Sold |
---|---|
TRACK_1 | QTY_1 |
TRACK_2 | QTY_2 |
TRACK_3 | QTY_3 |
TRACK_4 | QTY_4 |
TRACK_5 | QTY_5 |
-
Clone the Repository:
git clone https://github.com/your-repo/Chinook-Queries cd Chinook-Queries
-
Install Dependencies:
- Python 3.x
- SQLite3
- Plotly Express
pip install -r requirements.txt
-
Run the Script:
python chinook_queries.py
-
Explore the Results: Visualize data trends and uncover actionable insights through interactive graphs and detailed tables.
- Hands-On SQL Practice: Explore advanced querying techniques to answer business-critical questions.
- Data Storytelling: Use data to create narratives that inform decision-making.
- Interactive Visualizations: Translate raw numbers into visually compelling insights.
- Real-World Application: Analyze a dataset modeled after real-world business scenarios.
Dive into the Chinook database, uncover hidden insights, and see data come to life. Fork the repo, run the queries, and unleash your inner data analyst. Let’s make music with data! 🎵