dmatsanganis/Twitter_Mention_Graph_Analysis_A_Five_Day_Case_Study

Twitter Mention Graph Analysis: A Five-Day Case Study

This repository contains a comprehensive analysis of a Twitter mention dataset from July 2009. The goal of this case study is to analyze the mention relationships between users, identify the most important topics for each user based on hashtags, and perform various analyses on the graph representation of the data.

Dataset

The dataset used in this case study consists of tweets from July 2009. Each tweet record includes the time of posting, the user handle, and the tweet text. The raw data has been processed into five CSV files, one per day, each representing that day's weighted directed mention graph. Each CSV file lists the users involved in mentions, the number of mentions between each pair of users (the edge weight), and the most important topic (hashtag) for each user.
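As a rough illustration of what a weighted directed mention graph looks like, the sketch below counts mentions and hashtags in a handful of made-up tweets. The function name `build_mention_graph`, the regular expressions, and the sample tweets are assumptions for illustration only; the repository's own Python script may extract and store the data differently.

```python
import re
from collections import Counter, defaultdict

# Simplified patterns (assumption): "@name" is a mention, "#tag" is a hashtag.
MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")


def build_mention_graph(tweets):
    """Build one day's mention graph from (user, text) pairs.

    Returns (edge_weights, top_topic):
      edge_weights maps (sender, mentioned_user) to the number of mentions,
      top_topic maps each user to the hashtag they used most often.
    """
    edge_weights = Counter()
    hashtag_counts = defaultdict(Counter)
    for user, text in tweets:
        sender = user.lower()
        for mentioned in MENTION_RE.findall(text):
            edge_weights[(sender, mentioned.lower())] += 1
        for tag in HASHTAG_RE.findall(text):
            hashtag_counts[sender][tag.lower()] += 1
    top_topic = {
        user: counts.most_common(1)[0][0]
        for user, counts in hashtag_counts.items()
    }
    return edge_weights, top_topic


# Hypothetical sample data, not taken from the real dataset.
sample_tweets = [
    ("alice", "@bob have you seen this? #python"),
    ("alice", "@bob @carol meeting at noon #python #work"),
    ("bob", "@alice yes! #rstats"),
]
edges, topics = build_mention_graph(sample_tweets)

for (source, target), weight in sorted(edges.items()):
    print(source, target, weight)
```

Each printed row corresponds to one weighted edge (source user, mentioned user, mention count), which is the kind of information the daily CSV files are described as holding. Users who never use a hashtag have no entry in `topics` in this sketch.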

Analysis Steps

The analysis proceeds through the steps below. For a more detailed description, see the provided documentation:

Data Manipulation: The raw data is processed with Python to create the five CSV files, each representing the mention graph for one day.
Graph Creation: R is used to create igraph graphs based on the CSV files. The graph vertices are updated to include the topic attribute for each user, enabling further analysis and visualization.
Metric Evolution: The evolution of different metrics over the five-day period is examined. Plots are created to visualize changes in the number of vertices, number of edges, graph diameter, average in-degree, and average out-degree. Significant fluctuations in these metrics are identified and discussed.
Top User Analysis: Data frames are generated for each day, highlighting the top-10 Twitter users based on in-degree, out-degree, and PageRank. Variations in the top-10 lists are observed, indicating changes in user influence and popularity.
Community Detection: Community detection algorithms, including fast greedy clustering, infomap clustering, and Louvain clustering, are applied to the undirected versions of the mention graphs. The performance of these algorithms is evaluated, and insights are provided on their effectiveness.
Community Evolution: A specific user present in all five graphs is chosen, and their community evolution is analyzed. Similarities in the communities the user belongs to are identified, along with the most important topics of interest. Shared topics among communities are explored, and a visualization of the graph is created, using different colors to represent each community. Nodes belonging to very small or large communities are filtered out to improve clarity and aesthetics.
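The Metric Evolution and Top User Analysis steps can be sketched in plain Python as follows. The case study itself uses R with igraph; this is a standard-library stand-in, and the function names, the weighted PageRank variant, the damping factor of 0.85, and the sample edges are all assumptions for illustration. Graph diameter is left out to keep the sketch short.

```python
from collections import defaultdict


def degrees(edges):
    """Count distinct incoming and outgoing neighbours for each user."""
    in_deg, out_deg = defaultdict(int), defaultdict(int)
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    return in_deg, out_deg


def summary(edges):
    """Vertex count, edge count, and average degrees for one day's graph."""
    vertices = {user for pair in edges for user in pair}
    n = len(vertices)
    # Every edge adds one in-degree and one out-degree, so both averages
    # equal edges / vertices in a directed graph.
    avg = len(edges) / n if n else 0.0
    return {"vertices": n, "edges": len(edges),
            "avg_in_degree": avg, "avg_out_degree": avg}


def top_k(scores, k=10):
    """Top-k (user, score) pairs, ties broken alphabetically."""
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank using mention counts as edge weights."""
    nodes = sorted({user for pair in edges for user in pair})
    if not nodes:
        return {}
    n = len(nodes)
    out_weight = defaultdict(float)
    for (src, _dst), weight in edges.items():
        out_weight[src] += weight
    rank = {user: 1.0 / n for user in nodes}
    for _ in range(iterations):
        # Rank held by users who mention nobody is spread over all users.
        dangling = sum(rank[u] for u in nodes if out_weight[u] == 0)
        new_rank = {u: (1 - damping) / n + damping * dangling / n
                    for u in nodes}
        for (src, dst), weight in edges.items():
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank


# Hypothetical edges for a single day: (source, target) -> mention count.
day1 = {("alice", "bob"): 2, ("alice", "carol"): 1,
        ("bob", "alice"): 1, ("dave", "alice"): 3}

metrics = summary(day1)
in_deg, out_deg = degrees(day1)
top_in = top_k(in_deg, k=3)
ranks = pagerank(day1)
print(metrics)
print(top_in)
print(top_k(ranks, k=3))
```

Running `summary` on each of the five daily graphs and collecting the results gives the series that the metric plots are based on. One property worth keeping in mind when reading those plots: in any directed graph the average in-degree and the average out-degree are the same number.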
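For the Community Evolution step, one simple way to compare the communities a user belongs to on different days is the Jaccard similarity of their member sets. The sketch below assumes that a community detection algorithm has already assigned a community id to every user; the membership dictionaries, topic dictionaries, size thresholds, and function names are hypothetical, and the case study may measure similarity differently.

```python
from collections import Counter


def community_of(user, membership):
    """Set of users sharing a community with `user`.

    membership maps each user to a community id for one day.
    """
    community_id = membership[user]
    return {u for u, c in membership.items() if c == community_id}


def jaccard(a, b):
    """Jaccard similarity: size of intersection over size of union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def topics_in(members, topic_of):
    """Set of top topics among the given members (users without one are skipped)."""
    return {topic_of[u] for u in members if u in topic_of}


def filter_by_size(membership, min_size, max_size):
    """Keep only users whose community size lies within [min_size, max_size]."""
    sizes = Counter(membership.values())
    return {u: c for u, c in membership.items()
            if min_size <= sizes[c] <= max_size}


# Hypothetical community assignments and top topics for two days.
day1 = {"alice": 0, "bob": 0, "carol": 0, "dave": 1}
day2 = {"alice": 3, "bob": 3, "erin": 3, "dave": 4}
topic_day1 = {"alice": "python", "bob": "rstats", "carol": "python"}
topic_day2 = {"alice": "python", "bob": "work", "erin": "python"}

c1 = community_of("alice", day1)
c2 = community_of("alice", day2)
similarity = jaccard(c1, c2)
shared_topics = topics_in(c1, topic_day1) & topics_in(c2, topic_day2)
kept = filter_by_size(day1, min_size=2, max_size=10)
print(similarity, shared_topics, sorted(kept))
```

Community ids are arbitrary labels that change from day to day, which is why the comparison is made on member sets rather than on the ids themselves. `filter_by_size` mirrors the idea of dropping very small or very large communities before plotting.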

Contributors

Dimitris Matsanganis

Folders and files

Created Diagrams
Exported Datasets
Case Study Documentation.pdf
Case Study.pdf
LICENSE
README.md
analysis.R
raw_data_handler.py
