The issue of data scraping for Twitter 15 and Twitter 16 datasets #34

wwt12138 · 2025-01-24T13:55:19Z

Hello, I have a few questions about crawling the datasets and would appreciate your guidance:
1. You mentioned earlier that you crawled 2,000 followers for each user in the social network to build the relationships between them. While crawling, however, I found that some users in the Twitter15 and Twitter16 datasets appear to have deactivated their accounts, so their information can no longer be retrieved. How did you handle these cases?
2. For the 198 events, once the participating users of each event are known: for example, if 140 people took part in spreading event A, is the social graph built by crawling 2,000 followers for each of those 140 participating users?

Redtides0 · 2025-01-27T05:50:55Z

> You previously mentioned crawling 2,000 followers for each user on social networks to construct their relationships. However, during my crawling process, I noticed that in the Twitter15 and Twitter16 datasets, some users seem to have deactivated their accounts, making it impossible to retrieve their information. How did you handle such situations?

Yes, that does happen, but such accounts were only a small minority in the data I collected, so I simply ignored them.

> For the 198 events, once the participating users for each event have been identified, for example, if 140 people participated in event A, is the social relationship constructed by crawling 2,000 followers for each of these 140 participating users?

Your understanding of the main idea is correct, but what I crawled was each user's 'following' list rather than their 'followers'. Because of cost constraints, you may choose between the two based on how many of each there are.
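
For readers trying to reproduce this, the procedure described above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used for the dataset: the function name `build_social_edges`, the `following` dictionary (user ID mapped to an already-crawled following list, or `None` when the account could not be retrieved), and the toy user IDs are all hypothetical. The crawling step itself is left out, and whether edges to non-participants are kept is also an assumption.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

MAX_NEIGHBORS = 2000  # per-user cap mentioned in this thread


def build_social_edges(
    participants: Iterable[str],
    following: Dict[str, Optional[List[str]]],
    max_neighbors: int = MAX_NEIGHBORS,
) -> Tuple[Set[Tuple[str, str]], List[str]]:
    """Build directed (user, followee) edges for one event.

    Users with no crawled data (missing key or None, e.g. a
    deactivated account) are skipped and reported separately.
    """
    edges: Set[Tuple[str, str]] = set()
    skipped: List[str] = []
    for user in participants:
        followees = following.get(user)
        if followees is None:
            # Account unavailable: ignore it, as suggested in the reply.
            skipped.append(user)
            continue
        # Keep at most max_neighbors entries per user.
        for followee in followees[:max_neighbors]:
            edges.add((user, followee))
    return edges, skipped


# Toy data (hypothetical IDs): "u3" is deactivated, "u4" was never crawled.
crawled = {
    "u1": ["u2", "u3"],
    "u2": ["u1"],
    "u3": None,
}
edges, skipped = build_social_edges(["u1", "u2", "u3", "u4"], crawled)
```

Running this once per event, with that event's participants, gives one edge set per event. If followers are crawled instead of following, the same sketch applies with the edge direction reversed.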

lightaime changed the title ~~推特15与推特16数据集爬取问题~~ The issue of data scraping for Twitter 15 and Twitter 16 datasets Jan 25, 2025

zhangzaibin assigned Redtides0 Jan 27, 2025
