ITA-ELECTION-2022: A multi-platform dataset of social media conversations around the 2022 Italian general election

This is a repository of social media posts related to the 2022 Italian general election. For more details about the collection procedure, see the related dataset paper:

Francesco Pierri, Geng Liu, and Stefano Ceri. 2023. ITA-ELECTION-2022: A Multi-Platform Dataset of Social Media Conversations Around the 2022 Italian General Election. In Proceedings of the 32nd ACM International Conference on Information and Knowledge Management (CIKM '23). Association for Computing Machinery, New York, NY, USA, 5386–5390. https://doi.org/10.1145/3583780.3615121

BibTeX:

@inproceedings{10.1145/3583780.3615121,
  author = {Pierri, Francesco and Liu, Geng and Ceri, Stefano},
  title = {ITA-ELECTION-2022: A Multi-Platform Dataset of Social Media Conversations Around the 2022 Italian General Election},
  year = {2023},
  isbn = {9798400701245},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  url = {https://doi.org/10.1145/3583780.3615121},
  doi = {10.1145/3583780.3615121},
  abstract = {Online social media play a major role in shaping public discourse and opinion, especially during political events. We present the first public multi-platform dataset of Italian-language political conversations, focused on the 2022 Italian general election taking place on September 25th. Leveraging public APIs and a keyword-based search, we collected millions of posts published by users, pages and groups on Facebook, Instagram and Twitter, along with metadata of TikTok and YouTube videos shared on these platforms, over a period of four months. We augmented the dataset with a collection of political ads sponsored on Meta platforms, and a list of social media handles associated with political representatives. Our data resource will allow researchers and academics to further our understanding of the role of social media in the democratic process.},
  booktitle = {Proceedings of the 32nd ACM International Conference on Information and Knowledge Management},
  pages = {5386–5390},
  numpages = {5},
  keywords = {multi-platform, italy, social media, politics, advertisement},
  location = {Birmingham, United Kingdom},
  series = {CIKM '23}
}

If you use this data please don't forget to cite the paper above, thanks!

Contact: francesco.pierri at polimi.it

Files description

fb_ids_urls and ig_ids_urls contain daily ".csv" files for Facebook and Instagram data, respectively. Each row provides the platformId and postUrl of a post, which can be used to retrieve the original post via CrowdTangle or to open it directly in the browser (for more details, see the CrowdTangle documentation).
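As a minimal sketch of how the daily files could be read, the snippet below gathers the platformId and postUrl columns from every ".csv" file in one of these folders. The function name `load_posts` is hypothetical, and the code assumes each file has a header row containing exactly those two column names, as described above; adjust it if the actual files differ.

```python
import csv
from pathlib import Path


def load_posts(folder):
    """Collect (platformId, postUrl) pairs from every daily CSV in `folder`.

    Assumes each file has a header row with `platformId` and `postUrl` columns.
    """
    posts = []
    # Sorting by file name keeps the daily files in a stable order.
    for csv_path in sorted(Path(folder).glob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            for row in csv.DictReader(handle):
                posts.append((row["platformId"], row["postUrl"]))
    return posts
```

For example, `load_posts("fb_ids_urls")` would return one tuple per Facebook post listed in the folder.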

tw_ids contains daily ".txt" files for Twitter data. Each file provides IDs of tweets that can be used to re-hydrate tweets using Hydrator or Twarc.
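Before re-hydrating, it can help to merge the daily files into one deduplicated list of IDs and split it into batches. The sketch below assumes one tweet ID per line in each ".txt" file, which is not confirmed here. The helper names `read_tweet_ids` and `batches` are hypothetical. The default batch size of 100 is an assumption that mirrors the per-request limit of the Twitter API v2 tweet lookup endpoint; change it to suit the tool you use.

```python
from pathlib import Path


def read_tweet_ids(folder):
    """Return unique tweet IDs from all .txt files in `folder`, in first-seen order.

    Assumes one ID per line; blank lines are skipped.
    """
    ids = []
    seen = set()
    for txt_path in sorted(Path(folder).glob("*.txt")):
        for line in txt_path.read_text(encoding="utf-8").splitlines():
            tweet_id = line.strip()
            if tweet_id and tweet_id not in seen:
                seen.add(tweet_id)
                ids.append(tweet_id)
    return ids


def batches(items, size=100):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

The IDs are kept as strings so that they can be written back to a file and passed to Hydrator or Twarc unchanged.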

keywords.txt contains the list of election-related keywords employed for collecting the data on different platforms. These were obtained through a snowball sampling approach starting with seeds such as "elezioni" or "elezioni2022".
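To check whether a retrieved post mentions any of the listed keywords, a simple case-insensitive substring match is one option. This is only an illustrative sketch: it assumes keywords.txt holds one keyword per line, the function names `load_keywords` and `matches_any` are hypothetical, and the matching performed by each platform's search API during the original collection may well differ.

```python
def load_keywords(path):
    """Read keywords from `path`, one per line, lowercased, skipping blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip().lower() for line in handle if line.strip()]


def matches_any(text, keywords):
    """Return True if any keyword occurs in `text` as a case-insensitive substring."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in keywords)
```

With the seed keyword "elezioni", a post containing "#Elezioni2022" would match, because the comparison ignores case and looks for substrings.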

meta_ads_ids.txt contains the list of IDs of ads collected in the dataset. These can be used to retrieve ads through the Meta Ad Library API or search console. A repository containing screenshots of these ads is available here

*_metadata.csv files contain metadata for YouTube and TikTok videos shared on Twitter and Facebook. YouTube metadata was collected through the official API, whereas TikTok metadata was scraped using the pyktok library.

*_representatives_handles.csv files contain social media handles associated with Italian political representatives.

Twitter data

Given the limitations that are likely to be introduced to public APIs in the future, we encourage interested researchers to reach out to us if they have difficulty accessing Twitter data.

Acknowledgments

We are thankful to M.Sc. students Valeria Panté and Ilaria Saini for helping match social media accounts to political representatives. Work supported in part by PRIN grant HOPE (FP6, Italian Ministry of Education).

