- Dependencies
- Docker containers
- How to use it
- Queue processor
- Service configuration
- Get service logs
- Set up environment for development
- Execute tests
- Docker 20.10.14
- Docker-compose 2.4.1
Note: On Mac, Docker-compose is installed with Docker.
A redis server is needed to use the service asynchronously. For that purpose, the command ./run start:testing can be used, which starts a built-in redis server.
Containers with ./run start
Containers with ./run start:testing
- Add the Twitter key to the configuration file src/config.yml
twitter_bearer_token: [TOKEN]
- Start the service with docker compose
./run start
- Add a message to the tasks redis queue with the following format
queue = RedisSMQ(host='127.0.0.1', port='6579', qname='twitter_crawler_tasks', quiet=False)
queue.sendMessage(delay=0).message('{"tenant": "tenant_name", "task": "get_tweet", "params": {"query": "@user_handler or #hashtag", "tweets_languages": ["en"]}}').execute()
- Get results from the results queue
queue = RedisSMQ(host='127.0.0.1', port='6579', qname='twitter_crawler_results', quiet=False)
results_message = queue.receiveMessage().exceptions(False).execute()
# Each crawled tweet is placed in the results queue as a json with the following format
# { "tenant": "str"
# "task": "str"
# "params": {"created_at": "int",
# "user": {
# "author_id": "int",
# "name": "str",
# "alias": "str",
# "display_name": "str",
# "url": "str",
# },
# "text": str,
# "images_urls": List[str],
# "source": "str",
# "hashtags": List[str],
# "title": "str",
# "tweet_id": "int"},
# "success": "bool",
# "error_message": "str",
# "data_url": "str",
# "file_url": "str"
# }
- Stop the service
./run stop
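The steps above can be sketched as a small client. This is a minimal sketch, not the project's own client: the helper names `build_task_message`, `parse_result`, and `send_and_poll` are hypothetical, and it assumes the `rsmq` package (PyRSMQ) is installed and the service is reachable on the host and port used in the examples above. Because the `message` field may arrive either as a JSON string or as an already decoded object, depending on the library version, `parse_result` accepts both.

```python
import json


def build_task_message(tenant, query, languages):
    """Serialize a get_tweet task in the format expected by the tasks queue."""
    return json.dumps({
        "tenant": tenant,
        "task": "get_tweet",
        "params": {"query": query, "tweets_languages": list(languages)},
    })


def parse_result(received):
    """Decode one item returned by receiveMessage(); None if the queue was empty."""
    if not received:
        return None
    payload = received["message"]
    # Assumption: the payload may be raw JSON text or an already decoded object.
    if isinstance(payload, (str, bytes, bytearray)):
        return json.loads(payload)
    return payload


def send_and_poll(tenant, query, languages, host="127.0.0.1", port="6579"):
    """Send one task, then try once to read one result (hypothetical helper)."""
    # Imported lazily so the helpers above work without the library installed.
    from rsmq import RedisSMQ

    tasks = RedisSMQ(host=host, port=port, qname="twitter_crawler_tasks", quiet=False)
    tasks.sendMessage(delay=0).message(
        build_task_message(tenant, query, languages)
    ).execute()

    results = RedisSMQ(host=host, port=port, qname="twitter_crawler_results", quiet=False)
    return parse_result(results.receiveMessage().exceptions(False).execute())
```

A crawl usually takes some time, so a single call to `send_and_poll` will often return `None`; in practice the results queue would be polled repeatedly until a message with the expected `tenant` arrives.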
The Queue processor container is written in Python 3.9 and is in charge of the communication with redis.
The code can be found in the file QueueProcessor.py, and it uses the RedisSMQ library to interact with the redis queues.
A configuration file can be provided to set the redis server parameters and the Twitter bearer token. If no configuration file is provided, the default values are used.
The configuration file can be created manually or with the following script:
python3 -m pip install graypy~=2.1.0 PyYAML~=5.4.1
python3 ServiceConfig.py
Configuration file name: src/config.yml
Default parameters:
twitter_bearer_token:
redis_host: 127.0.0.1
redis_port: 6379
mongo_host: 127.0.0.1
mongo_port: 29017
graylog_ip:
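The fallback to default values can be illustrated with a short sketch. It is not the code in ServiceConfig.py: the names `DEFAULTS`, `merge_with_defaults`, and `load_config` are hypothetical, and the sketch assumes that an empty value in the file (such as `graylog_ip:`) should leave the default in place.

```python
import os

# Defaults copied from the list above; empty entries are represented as None.
DEFAULTS = {
    "twitter_bearer_token": None,
    "redis_host": "127.0.0.1",
    "redis_port": 6379,
    "mongo_host": "127.0.0.1",
    "mongo_port": 29017,
    "graylog_ip": None,
}


def merge_with_defaults(loaded):
    """Overlay values read from the file on the defaults; empty values are ignored."""
    loaded = loaded or {}
    overrides = {key: value for key, value in loaded.items() if value is not None}
    return {**DEFAULTS, **overrides}


def load_config(path="src/config.yml"):
    """Return the effective configuration, using only defaults if the file is missing."""
    if not os.path.exists(path):
        return dict(DEFAULTS)
    # Imported lazily so the defaults can be inspected without PyYAML installed.
    import yaml

    with open(path) as config_file:
        return merge_with_defaults(yaml.safe_load(config_file))
```

For example, a file containing only `redis_port: 6579` would yield a configuration with that port and the default value for every other parameter.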
The service logs are stored by default in the files docker_volume/redis_tasks.log
and docker_volume/redis_tasks.log
To use a graylog server, add the following line to the config.yml
file:
graylog_ip: [ip]
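As an illustration of how a file log and an optional graylog handler can coexist, here is a minimal sketch using the standard `logging` module and `graypy`. The function name `build_logger` is hypothetical, and the UDP transport and port 12201 are assumptions; the service itself may configure its handlers differently.

```python
import logging


def build_logger(name, log_path, graylog_ip=None, graylog_port=12201):
    """Create a logger that writes to a file and, optionally, to a graylog server."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(logging.FileHandler(log_path))
    if graylog_ip:
        # Imported lazily: graypy is only needed when graylog_ip is configured.
        import graypy

        # Assumption: GELF over UDP on graylog's conventional port.
        logger.addHandler(graypy.GELFUDPHandler(graylog_ip, graylog_port))
    return logger
```

Calling `build_logger("redis_tasks", "docker_volume/redis_tasks.log")` without a `graylog_ip` keeps the file-only behaviour described above.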
It works with Python 3.9 [install](https://runnable.com/docker/getting-started/)
./run install_venv
./run test