f-krause/reddit_nlp_project

 
 

Reddit climate change project

[Image: Sad Snoo]

About

This repository contains a university project on the text analysis and processing of Reddit posts from 2010 to 2022 that contain the words "climate" and "change".

Supervision: Prof. Dr. Jan Fabian Ehmke

Group members: Britz Luis, Huber Anja, Krause Felix Elias, Preda Yvonne-Nadine

University of Vienna, summer term 2023

Download the data from Kaggle and place it in a folder named "data" in the working directory.
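
A minimal sketch for checking that the data is where the notebooks would look for it. The folder name "data" comes from the instruction above; the function name `find_data_files` and the assumption that the download consists of CSV files are illustrative and not taken from the repository.

```python
from pathlib import Path


def find_data_files(working_dir=".", folder="data", pattern="*.csv"):
    """Return the sorted data files found in <working_dir>/<folder>.

    The folder name "data" follows the README; the "*.csv" pattern is an
    assumption about the download format.
    """
    data_dir = Path(working_dir) / folder
    if not data_dir.is_dir():
        raise FileNotFoundError(
            f"Expected a '{folder}' folder in {Path(working_dir).resolve()}"
        )
    # Sort so the order is stable across platforms.
    return sorted(data_dir.glob(pattern))
```

Calling `find_data_files()` from the repository root should list the downloaded files, or raise a `FileNotFoundError` if the "data" folder is missing.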


File Contents

  1. Exploratory data analysis (EDA) and text analysis of the dataset mentioned above
  2. Pre-processing of data
  3. Year-wise topic detection using BERT
  4. Sentiment and emotion detection using pre-trained HuggingFace transformers
  5. Visualizations of general results
  6. Two files for creating a final plot for high-level grouped topics
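
The year-wise analysis in step 3 presupposes that the texts are split by year. The following is a rough sketch of that idea using only the standard library, not the notebooks' actual code. The record layout (a Unix timestamp under `created_utc` and the text under `body`) and both function names are assumptions made for illustration, and the plain substring match is a simplification of whatever filtering the dataset itself applies.

```python
from collections import defaultdict
from datetime import datetime, timezone


def mentions_climate_change(text):
    """True if the text contains both words, case-insensitively."""
    lowered = text.lower()
    return "climate" in lowered and "change" in lowered


def group_by_year(records, first_year=2010, last_year=2022):
    """Map year -> list of texts for records inside the study period.

    Each record is assumed to be a dict with a Unix timestamp under
    "created_utc" and the text under "body" (hypothetical field names).
    """
    grouped = defaultdict(list)
    for record in records:
        year = datetime.fromtimestamp(
            record["created_utc"], tz=timezone.utc
        ).year
        in_period = first_year <= year <= last_year
        if in_period and mentions_climate_change(record["body"]):
            grouped[year].append(record["body"])
    return dict(grouped)
```

Each per-year list could then be passed to a topic model or a sentiment classifier separately, which is one way to obtain results that can be compared across years.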

Also check out the "poster_reddit_climate_change" file for an overview of the whole project.


Some Final Results

Emotion of Comments over the Years

[Figure: emotion over time]


Topic Frequencies over the Months in 2019

[Figure: Topics over the years]


Share of Comments grouped to High-Level Categories Over Time

[Figure: Groups over the years]



[Image: BERT meme]
