This repository contains a university project on the text analysis and processing of Reddit posts from 2010 to 2022 that contain the words "climate" and "change".
Supervision: Prof. Dr. Jan Fabian Ehmke
Group members: Britz Luis, Huber Anja, Krause Felix Elias, Preda Yvonne-Nadine
University of Vienna, summer term 2023
Download the data from Kaggle and place it in a folder named "data" in the working directory.
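To confirm the data is where the code expects it, a quick check like the following may help. This is only a minimal sketch: the helper name `list_data_files` is hypothetical and not part of the repository, and it makes no assumption about what the downloaded files are called.

```python
from pathlib import Path


def list_data_files(data_dir="data"):
    """Return the sorted names of the files found in the data folder.

    Hypothetical helper for illustration; not part of the project code.
    """
    folder = Path(data_dir)
    if not folder.is_dir():
        raise FileNotFoundError(
            f"Expected a folder named '{data_dir}' in the working directory"
        )
    # Only regular files are listed; sub-folders are ignored.
    return sorted(p.name for p in folder.iterdir() if p.is_file())
```

Calling `list_data_files()` from the working directory should list the downloaded files, or raise an error if the "data" folder is missing.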
- Exploratory data analysis (EDA) and text analysis of the dataset mentioned above
- Pre-processing of data
- Year-wise topic detection using BERT
- Sentiment and emotion detection using pre-trained HuggingFace transformers
- Visualizations of general results
- Two files for creating a final plot for high-level grouped topics
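The pre-processing and year-wise grouping steps listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual code: the field names `body` and `created_utc`, the cleaning rules, and the function names are all hypothetical, and only the standard library is used.

```python
import re
from collections import defaultdict
from datetime import datetime, timezone

URL_RE = re.compile(r"https?://\S+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def clean_text(text):
    """Lowercase, drop URLs and non-letter characters, collapse whitespace."""
    text = URL_RE.sub(" ", text.lower())
    text = NON_LETTER_RE.sub(" ", text)
    return " ".join(text.split())


def group_by_year(posts):
    """Group cleaned posts containing both keywords by calendar year (UTC).

    Assumes each post is a dict with a text field "body" and a Unix
    timestamp "created_utc"; both names are assumptions.
    """
    by_year = defaultdict(list)
    for post in posts:
        cleaned = clean_text(post["body"])
        # Whole-word match only: "changes" or "changed" would not count.
        if {"climate", "change"} <= set(cleaned.split()):
            created = datetime.fromtimestamp(post["created_utc"], tz=timezone.utc)
            by_year[created.year].append(cleaned)
    return dict(by_year)
```

Each year's list of cleaned texts could then be passed to a topic model or a sentiment classifier separately, which is what makes a year-wise comparison possible.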
See also the "poster_reddit_climate_change" file for an overview of the whole project.