Skip to content

KamauGilbert/top_100_uk_youtubers_2024

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

35 Commits
 
 
 
 
 
 

Repository files navigation

📊 YouTube Influencer Marketing Project

This Data Analytics project assists marketers in identifying the most suitable YouTuber for promoting product sales, leveraging data-driven insights to maximize marketing effectiveness. 🚀

📁 Dataset

🔗 Check out the Kaggle dataset for a comprehensive collection of data sourced from various social media platforms as of the first half of 2024.

📈 Approach

1. Viewing and Preparing the Dataset

2. Data Extraction

3. Loading Data into SQL Server

  • Database: Created a database called youtube_db.
  • Table: Loaded the youtube_data_from_python.csv file into SQL Server and renamed the table to top_uk_youtubers_2024.
  • Data Cleaning: Created the view view_uk_youtubers_2024 with the following steps:
/*
# Data Cleaning Steps:
1. Remove unnecessary columns.
2. Extract the YouTube channel names.
3. Rename the column names.
*/

-- Select relevant columns
SELECT
    NOMBRE,
    total_subscribers,
    total_views,
    total_videos
FROM top_uk_youtubers_2024;

-- Extract channel names using CHARINDEX function
SELECT CHARINDEX('@', NOMBRE), NOMBRE FROM top_uk_youtubers_2024;

-- Using SUBSTRING function
SELECT 
    CAST(SUBSTRING(NOMBRE, 1, CHARINDEX('@', NOMBRE) - 1) AS VARCHAR(100)) AS channel_name,
    total_subscribers,
    total_views,
    total_videos
FROM top_uk_youtubers_2024;

-- Create view with relevant data
CREATE VIEW view_uk_youtubers_2024 AS
SELECT 
    CAST(SUBSTRING(NOMBRE, 1, CHARINDEX('@', NOMBRE) - 1) AS VARCHAR(100)) AS channel_name,
    total_subscribers,
    total_views,
    total_videos
FROM top_uk_youtubers_2024;

4. Data Quality Checks

/*
# Data Quality Tests:

1. Row count: Ensure there are 100 records.
2. Column count: Ensure there are 4 fields.
3. Data type checks: Validate data types.
4. Duplicate count check: Ensure records are unique.
*/

-- Row Count Test
SELECT COUNT(*) AS number_of_rows FROM view_uk_youtubers_2024;

-- Column Count Test
SELECT COUNT(*) AS column_count FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'view_uk_youtubers_2024';

-- Data Type Check
SELECT
    COLUMN_NAME,
    DATA_TYPE
FROM
    INFORMATION_SCHEMA.COLUMNS
WHERE
    TABLE_NAME = 'view_uk_youtubers_2024';

-- Duplicate Count Check
SELECT
    channel_name,
    COUNT(*) AS duplicate_count
FROM 
    view_uk_youtubers_2024
GROUP BY 
    channel_name
HAVING 
    COUNT(*) > 1;

5. PowerBI File: Connected youtube_db in SQL SSMS to PowerBI to create an interactive dashboard.

PowerBI File: Connected youtube_db in SQL SSMS to PowerBI to create an interactive dashboard.

PowerBI Dashboard

6. Analysis and Recommendations

Analysis File: Conducted SQL and Excel analysis and provided recommendations. YouTubers Analysis

🚀 Future Steps

  1. Conduct an analysis of other social media platforms to offer a more comprehensive recommendation.

  2. Provide more detailed recommendations after further analysis.

Releases

No releases published

Packages

No packages published

Languages