Data fields from streamer
Pasan Kamburugamuwa edited this page Sep 19, 2023 · 7 revisions
With Mastodon.py streaming, you can collect data with the fields below (see the linked full documentation for details).
- id: ID of the status in the database.
- uri: URI of the status used for federation.
- created_at: The date when this status was created.
- account: The account that authored this status.
- account_id: The account id that authored this status.
- content: HTML-encoded status content.
- visibility: Toot visibility ('public', 'unlisted', 'private', or 'direct')
- sensitive: Is this status marked as sensitive content?
- spoiler_text: Subject or summary line, below which status content is collapsed until expanded.
- media_attachments: Media that is attached to this status.
- mentions: Mentions of users within the status content.
- tags: Hashtags used within the status content.
- emojis: Custom emoji to be used when rendering status content.
- url: A link to the status’s HTML representation.
- in_reply_to_id: ID of the status being replied to.
- in_reply_to_account_id: ID of the account that authored the status being replied to.
- reblog: The status being reblogged.
- poll: The poll attached to the status.
- card: Preview card for links included within status content.
- language: Primary language of this status.
- edited_at: Timestamp of when the status was last edited.
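As a minimal sketch of working with these fields, the snippet below pulls a few of them out of a parsed status dict and converts the HTML-encoded content to plain text using only the standard library. The helper names (content_to_text, summarize_status) and the choice of fields are illustrative assumptions, not part of the streamer.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the text of an HTML fragment, keeping paragraph and line breaks."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag == "p":
            self.parts.append("\n\n")

    def handle_data(self, data):
        # Character references such as &amp; arrive here already decoded.
        self.parts.append(data)


def content_to_text(content_html):
    """Hypothetical helper: strip tags from the HTML-encoded content field."""
    parser = _TextExtractor()
    parser.feed(content_html)
    parser.close()
    return "".join(parser.parts).strip()


def summarize_status(status):
    """Hypothetical helper: pick a handful of the fields listed above."""
    return {
        "id": status["id"],
        "created_at": status["created_at"],
        "language": status.get("language"),
        "visibility": status.get("visibility"),
        "text": content_to_text(status.get("content") or ""),
        # Each entry in tags is a dict; its name key holds the hashtag text.
        "tag_names": [tag["name"] for tag in status.get("tags", [])],
    }
```

Fields that may be missing or null (for example language) are read with .get so that a single incomplete status does not stop processing.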
The following fields are currently NOT accessible via the Mastodon.py package (as of 09/05/2023), or they may require a token from an authorized user.
- application OPTIONAL: The application used to post this status.
- favourites_count: How many favourites this status has received.
- reblogs_count: How many reblogs this status has received.
- replies_count: How many replies this status has received.
- favourited OPTIONAL: If the current token has an authorized user: Have you favourited this status?
- reblogged OPTIONAL: If the current token has an authorized user: Have you boosted this status?
- muted OPTIONAL: If the current token has an authorized user: Have you muted notifications for this status’s conversation?
- bookmarked OPTIONAL: If the current token has an authorized user: Have you bookmarked this status?
- pinned OPTIONAL: If the current token has an authorized user: Have you pinned this status? Only appears if the status is pinnable.
- filtered OPTIONAL: If the current token has an authorized user: The filter and keywords that matched this status.
Currently, the streamer saves Mastodon data in a .json file. See the linked example JSON of the response.
- Note that some of the data fields (account, media_attachments, mentions, tags, emojis, card), along with their subfields, have a variably nested JSON structure (i.e., the structure is not consistent across entries). For example, a post could have 0...N tags. Thus, if you are trying to convert the data into a .csv or other delimited text file (for data analysis), you need to create a function that processes the JSON files dynamically by determining the unique set of keys across all entries and using that set to create the CSV header (e.g., tags_1, tags_2, ..., tags_N).
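The dynamic-header approach described in the note could be sketched as follows. This is one possible implementation, not the streamer's own code: the function names flatten and statuses_to_csv are hypothetical, and because each tag is itself a dict, the generated column names come out as tags_1_name, tags_2_name, and so on rather than plain tags_1.

```python
import csv
import io


def flatten(value, prefix=""):
    """Flatten nested dicts and lists into one level, e.g. tags -> tags_1_name."""
    flat = {}
    if isinstance(value, dict):
        for key, item in value.items():
            flat.update(flatten(item, f"{prefix}_{key}" if prefix else key))
    elif isinstance(value, list):
        # List positions become 1-based suffixes: tags_1, tags_2, ...
        for index, item in enumerate(value, start=1):
            flat.update(flatten(item, f"{prefix}_{index}"))
    else:
        flat[prefix] = value
    return flat


def statuses_to_csv(statuses, out_file):
    """Write status dicts as CSV, building the header from all entries."""
    rows = [flatten(status) for status in statuses]
    # Union of keys across every row, kept in first-seen order.
    header = list(dict.fromkeys(key for row in rows for key in row))
    writer = csv.DictWriter(out_file, fieldnames=header, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return header


if __name__ == "__main__":
    sample = [
        {"id": 1, "tags": [{"name": "a"}]},
        {"id": 2, "tags": [{"name": "b"}, {"name": "c"}]},
    ]
    buffer = io.StringIO()
    statuses_to_csv(sample, buffer)
    print(buffer.getvalue())
```

Entries with fewer items than the widest entry get empty cells (restval=""), so every row has the same number of columns. Empty lists and empty dicts produce no columns at all.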
We used the Mastodon.py package to connect to the instance. For streaming, we subclassed StreamListener and implemented its on_update method, which is called when a new status has appeared; its status argument is the parsed status dict describing the status.
- To connect to the instance, install the package:
pip3 install Mastodon.py
- Import the package
from mastodon import Mastodon, StreamListener
- The reference implementation is available in the linked code.
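Putting the steps above together, a listener that appends each new status to a file might look like the sketch below. It is not the linked reference implementation: the names status_to_json_line, run_stream, and FileListener, the one-JSON-object-per-line layout, and the use of the public timeline via stream_public are assumptions made for illustration. Some instances may also require an access token to stream.

```python
import json
from datetime import datetime


def status_to_json_line(status):
    """Serialize one parsed status dict to a single line of JSON.

    Mastodon.py parses timestamps such as created_at into datetime objects,
    which the json module cannot encode, so they are written as ISO 8601 strings.
    """

    def encode(value):
        if isinstance(value, datetime):
            return value.isoformat()
        raise TypeError(f"Cannot serialize {type(value).__name__}")

    return json.dumps(status, default=encode, ensure_ascii=False)


def run_stream(api_base_url, out_path):
    """Stream the public timeline and append every new status to out_path."""
    # Imported here so status_to_json_line stays usable without Mastodon.py installed.
    from mastodon import Mastodon, StreamListener

    class FileListener(StreamListener):
        def __init__(self, out_file):
            self.out_file = out_file

        def on_update(self, status):
            # Called by Mastodon.py whenever a new status appears on the stream.
            self.out_file.write(status_to_json_line(status) + "\n")

    # Assumption: anonymous access; pass access_token=... if the instance requires it.
    client = Mastodon(api_base_url=api_base_url)
    with open(out_path, "a", encoding="utf-8") as out_file:
        client.stream_public(FileListener(out_file))  # blocks while streaming
```

Calling run_stream with an instance URL (for example a placeholder such as "https://mastodon.example") and an output path starts the stream. Writing one object per line keeps the file appendable while the stream is running.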