Deployment on other Wikimedia wikis #37

Merged
merged 4 commits into from
Sep 22, 2021
2 changes: 1 addition & 1 deletion deployment/celery.yaml
@@ -21,7 +21,7 @@ spec:
spec:
containers:
- name: celery
image: docker-registry.tools.wmflabs.org/toollabs-python37-base:latest
image: docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest
command: [ "/data/project/editgroups/www/python/src/tasks.sh" ]
workingDir: /data/project/editgroups/www/python/src
env:
2 changes: 1 addition & 1 deletion deployment/listener.yaml
@@ -21,7 +21,7 @@ spec:
spec:
containers:
- name: listener
image: docker-registry.tools.wmflabs.org/toollabs-python37-base:latest
image: docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest
command: [ "/data/project/editgroups/www/python/src/listener.sh" ]
workingDir: /data/project/editgroups/www/python/src
env:
2 changes: 1 addition & 1 deletion deployment/migrator.yaml
@@ -21,7 +21,7 @@ spec:
spec:
containers:
- name: migrator
image: docker-registry.tools.wmflabs.org/toollabs-python37-base:latest
image: docker-registry.tools.wmflabs.org/toolforge-python37-sssd-base:latest
command: [ "/data/project/editgroups/www/python/src/migrator.sh" ]
workingDir: /data/project/editgroups/www/python/src
env:
61 changes: 43 additions & 18 deletions docs/install.rst
@@ -69,55 +69,69 @@ to fill in the following fields:
Once you validate this form, the edits listener will pick up the new tool (after a few minutes) and start ingesting its edits. If you
need to ingest past edits again, you can use the listener script with a date in the past to retrieve the previous edits.

Deploying on WMF Toollabs
-------------------------
Deploying on WMF Toolforge
--------------------------

In what follows we assume that the tool is deployed as the ``editgroups`` project.

- ``become editgroups``
- ``mkdir -p www/python/src``

Put the following contents in ``manifest.template`` in the home directory of the tool::

backend: kubernetes
type: python3.7

Install the dependencies in the virtualenv::

webservice shell
cd www/python
virtualenv venv --python /usr/bin/python3
python3 -m venv venv
source venv/bin/activate
pip install --upgrade pip
git clone https://github.com/Wikidata/editgroups.git src
pip install -r src/requirements.txt

Configure static files::

mkdir -p src/static
ln -s src/static
ln -s src/static .

Put the following content in ``~/www/uwsgi.ini``::

[uwsgi]
check-static = /data/project/editgroups/www/python

and run ``./manage.py collectstatic``.

Create the SQL database:
Create the SQL database (outside of the ``webservice shell``):

- ``sql tools``
- ``CREATE DATABASE s1234__editgroups;`` where ``s1234`` is the SQL username of the tool
- ``CREATE DATABASE s1234__editgroups;`` where ``s1234`` is the SQL username of the tool (can be found in ``~/replica.my.cnf``)
- ``\q``

Configure database access and other settings::

cd ~/www/python/src/editgroups/settings/
echo "from .prod import *" > __init__.py
cp secret_wmflabs.py secret.py
cp secret_toolforge.py secret.py

Edit ``secret.py`` with the user
and password of the table (they can be found in ``~/replica.my.cnf``).
The name of the table is the one you used at creation above
(``s1234__editgroups`` where ``s1234`` is replaced by the username of
the tool).
the tool). Also, pick a secret key to store in ``SECRET_KEY``.

In ``editgroups/settings/__init__.py`` you can also copy over
setting lines from ``editgroups/settings/common.py`` and adapt them to
the wiki that you are running EditGroups for (for instance ``MEDIAWIKI_API_ENDPOINT`` and the following lines).
You should also adapt the allowed hostname (taken from ``editgroups/settings/prod.py``). It's easier
to add those to the ``__init__.py`` file to avoid editing files tracked by Git.

Put the following content in ``~/www/python/uwsgi.ini``::

[uwsgi]
static-map = /static=/data/project/editgroups/www/python/src/static

and run ``./manage.py collectstatic`` in the ``~/www/python/src`` directory.


Configure OAuth login:

- Request an OAuth client id at https://meta.wikimedia.org/wiki/Special:OAuthConsumerRegistration/propose. Beyond the normal editing scopes, you will also need to perform administrative actions (delete, restore) on behalf of users, so make sure you request these scopes too.
- Request an OAuth client id at https://meta.wikimedia.org/wiki/Special:OAuthConsumerRegistration/propose. As OAuth protocol version, use "OAuth 1.0a". As callback URL, use the domain of the tool and tick the box to treat it as a prefix. Beyond the normal editing scopes, you will also need to perform administrative actions (delete, restore) on behalf of users, so make sure you request these scopes too.
- Put the tokens in ``~/www/python/src/editgroups/settings/secret.py``

Migrate the database:
@@ -126,9 +140,20 @@ Migrate the database:

Run the webserver:

- ``webservice --backend kubernetes python start``
- ``webservice start``

Go to the webservice, login with OAuth to the application. This will create a ``User`` object that you can then mark as staff in the Django shell, as follows::

$ webservice shell
source ~/www/python/venv/bin/activate
cd www/python/src
./manage.py shell
from django.contrib.auth.models import User
user = User.objects.get()
user.is_staff = True
user.save()

Launch the listener and Celery in Kubernetes:
Launch the listener and Celery in Kubernetes. These deployment files may need to be adapted if you are not deploying the tool as the ``editgroups`` toolforge tool but another tool id:

- ``kubectl create -f deployment/listener.yaml``
- ``kubectl create -f deployment/celery.yaml``
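The note above about adapting the deployment files can be sketched as a small script. This is only a minimal sketch, assuming the tool-specific parts of the manifests are the ``/data/project/editgroups`` paths visible in this diff; the tool name ``mytool`` and the helper ``adapt_manifest`` are hypothetical, and a real manifest may contain other tool-specific fields that still need manual review.

```python
import re


def adapt_manifest(text, tool, original="editgroups"):
    """Rewrite /data/project/<original> path prefixes to /data/project/<tool>.

    Only path prefixes are touched; container names, labels and other
    fields are left unchanged.
    """
    # The negative lookahead avoids matching longer names such as
    # /data/project/editgroups-dev.
    pattern = re.compile(r"/data/project/" + re.escape(original) + r"(?![\w-])")
    return pattern.sub(lambda match: "/data/project/" + tool, text)


sample = (
    'command: [ "/data/project/editgroups/www/python/src/listener.sh" ]\n'
    "workingDir: /data/project/editgroups/www/python/src\n"
)
print(adapt_manifest(sample, "mytool"))
```

The adapted text would then be written to a copy of the manifest before running ``kubectl create -f`` on it.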
4 changes: 2 additions & 2 deletions dump_events.py
@@ -2,10 +2,10 @@
import sys
from sseclient import SSEClient as EventSource
from dateutil import parser
from store.stream import WikidataEditStream
from store.stream import WikiEditStream

if __name__ == '__main__':
s = WikidataEditStream()
s = WikiEditStream()
offset = None
if len(sys.argv) > 1:
offset = parser.parse(sys.argv[1])
20 changes: 20 additions & 0 deletions editgroups/context_processors.py
@@ -0,0 +1,20 @@
from django.conf import settings

def mediawiki_site_settings(request):
return {
'MEDIAWIKI_API_ENDPOINT': settings.MEDIAWIKI_API_ENDPOINT,
'MEDIAWIKI_BASE_URL': settings.MEDIAWIKI_BASE_URL,
'MEDIAWIKI_INDEX_ENDPOINT': settings.MEDIAWIKI_INDEX_ENDPOINT,
'PROPERTY_BASE_URL': settings.PROPERTY_BASE_URL,
'USER_BASE_URL': settings.USER_BASE_URL,
'USER_TALK_BASE_URL': settings.USER_TALK_BASE_URL,
'CONTRIBUTIONS_BASE_URL': settings.CONTRIBUTIONS_BASE_URL,
'WIKI_CODENAME': settings.WIKI_CODENAME,
'USER_DOCS_HOMEPAGE': settings.USER_DOCS_HOMEPAGE,
'MEDIAWIKI_NAME': settings.MEDIAWIKI_NAME,
'DISCUSS_PAGE_PREFIX': settings.DISCUSS_PAGE_PREFIX,
'DISCUSS_PAGE_PRELOAD': settings.DISCUSS_PAGE_PRELOAD,
'REVERT_PAGE': settings.REVERT_PAGE,
'REVERT_PRELOAD': settings.REVERT_PRELOAD,
'WIKILINK_BATCH_PREFIX': settings.WIKILINK_BATCH_PREFIX
}
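A Django context processor returns a dictionary that is merged into the context of every template rendered with a request, which is what lets the templates in this PR write ``{{ USER_BASE_URL }}`` directly. The sketch below shows one possible way to build the same kind of dictionary from a list of names instead of repeating each key. It is not the code from the PR: ``EXPOSED_SETTINGS``, ``site_settings_context`` and ``fake_settings`` are hypothetical names, and a ``SimpleNamespace`` stands in for ``django.conf.settings`` so that the example runs without Django.

```python
from types import SimpleNamespace

# Names to expose to templates; a subset of those listed in the diff above.
EXPOSED_SETTINGS = (
    "MEDIAWIKI_API_ENDPOINT",
    "MEDIAWIKI_BASE_URL",
    "USER_BASE_URL",
    "USER_DOCS_HOMEPAGE",
)


def site_settings_context(settings, names=EXPOSED_SETTINGS):
    """Return a template-context dict holding the named settings."""
    return {name: getattr(settings, name) for name in names}


# Stand-in for django.conf.settings (assumption, for illustration only).
fake_settings = SimpleNamespace(
    MEDIAWIKI_API_ENDPOINT="https://www.wikidata.org/w/api.php",
    MEDIAWIKI_BASE_URL="https://www.wikidata.org/wiki/",
    USER_BASE_URL="https://www.wikidata.org/wiki/User:",
    USER_DOCS_HOMEPAGE="https://www.wikidata.org/wiki/Wikidata:Edit_groups",
)

context = site_settings_context(fake_settings)
print(context["USER_BASE_URL"] + "Example")
```

A missing setting raises ``AttributeError`` here, just as the explicit ``settings.NAME`` lookups in the PR would.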
18 changes: 18 additions & 0 deletions editgroups/settings/common.py
@@ -95,6 +95,7 @@
"social_django.context_processors.backends",
"social_django.context_processors.login_redirect",
"tagging.filters.context_processor",
"editgroups.context_processors.mediawiki_site_settings",
),
'debug': True
}
@@ -183,6 +184,23 @@
SOCIAL_AUTH_EMAIL_LENGTH = 190

MEDIAWIKI_API_ENDPOINT = 'https://www.wikidata.org/w/api.php'
MEDIAWIKI_BASE_URL = 'https://www.wikidata.org/wiki/'
MEDIAWIKI_INDEX_ENDPOINT = 'https://www.wikidata.org/w/index.php'
PROPERTY_BASE_URL = MEDIAWIKI_BASE_URL + 'Property:'
USER_BASE_URL = MEDIAWIKI_BASE_URL + 'User:'
USER_TALK_BASE_URL = MEDIAWIKI_BASE_URL + 'User_talk:'
CONTRIBUTIONS_BASE_URL = MEDIAWIKI_BASE_URL + 'Special:Contributions/'
WIKI_CODENAME = 'wikidatawiki'
USER_DOCS_HOMEPAGE = 'https://www.wikidata.org/wiki/Wikidata:Edit_groups'
MEDIAWIKI_NAME = 'Wikidata'
DISCUSS_PAGE_PREFIX = 'Wikidata:Edit_groups/'
DISCUSS_PAGE_PRELOAD = 'Wikidata:Edit_groups/Preload'
REVERT_PAGE = 'Wikidata:Requests_for_deletions'
REVERT_PRELOAD = 'Wikidata:Edit_groups/Revert'
WATCHED_NAMESPACES = [0, 120]

WIKILINK_BATCH_PREFIX = ':toollabs:editgroups/b/'
REVERT_COMMENT_STAMP = ' ([[:toollabs:editgroups/b/EG/{}|details]])'

### Celery config ###
# Celery runs asynchronous tasks such as metadata harvesting or
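One consequence of the settings above is worth illustrating: ``PROPERTY_BASE_URL``, ``USER_BASE_URL`` and the other derived values are computed by string concatenation when ``common.py`` is imported, so overriding only ``MEDIAWIKI_BASE_URL`` later (for instance in ``__init__.py``) does not update them. The following is a minimal sketch of that behaviour in plain Python; ``wiki.example.org`` is a hypothetical wiki and ``stale_user_base_url`` is a name introduced only for this illustration.

```python
# Values as defined in common.py (taken from the diff above).
MEDIAWIKI_BASE_URL = 'https://www.wikidata.org/wiki/'
USER_BASE_URL = MEDIAWIKI_BASE_URL + 'User:'

# Overriding only the base URL afterwards (hypothetical target wiki) ...
MEDIAWIKI_BASE_URL = 'https://wiki.example.org/wiki/'

# ... leaves the derived value unchanged, because the concatenation
# has already been evaluated.
stale_user_base_url = USER_BASE_URL

# Recomputing the derived settings after the override brings them in sync.
USER_BASE_URL = MEDIAWIKI_BASE_URL + 'User:'
USER_TALK_BASE_URL = MEDIAWIKI_BASE_URL + 'User_talk:'
CONTRIBUTIONS_BASE_URL = MEDIAWIKI_BASE_URL + 'Special:Contributions/'

print(stale_user_base_url)
print(USER_BASE_URL)
```

When adapting the tool to another wiki, it is therefore probably safest to copy the whole block of URL settings rather than a single line.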
2 changes: 1 addition & 1 deletion editgroups/settings/prod.py
@@ -6,7 +6,7 @@
# Static files (CSS, JavaScript, Images)
# https://docs.djangoproject.com/en/1.7/howto/static-files/

STATIC_URL = '/editgroups/static/'
STATIC_URL = '/static/'
STATICFILES_DIRS = [
os.path.join(BASE_DIR, "static"),
]
33 changes: 33 additions & 0 deletions editgroups/settings/secret_toolforge.py
@@ -0,0 +1,33 @@
import os
from .common import BASE_DIR

# SECURITY WARNING: keep the secret key used in production secret!
SECRET_KEY = 'insert_a_random_hash_here'

# Database
# https://docs.djangoproject.com/en/1.7/ref/settings/#databases
DATABASES = {
'default': {
'ENGINE': 'django.db.backends.mysql',
'NAME': 's1234__editgroups', # adapt to the database you created
'HOST': 'tools.db.svc.eqiad.wmflabs',
'OPTIONS': {
'init_command': "SET sql_mode='STRICT_TRANS_TABLES'",
'charset': 'utf8mb4',
'read_default_file': os.path.expanduser("~/replica.my.cnf")
},
}
}

# Adapt those to the credentials you got
SOCIAL_AUTH_MEDIAWIKI_KEY = ''
SOCIAL_AUTH_MEDIAWIKI_SECRET = ''
SOCIAL_AUTH_MEDIAWIKI_URL = 'https://www.wikidata.org/w/index.php'
SOCIAL_AUTH_MEDIAWIKI_CALLBACK = 'https://editgroups.toolforge.org/oauth/complete/mediawiki/'

# Redis (if you use it)
REDIS_HOST = 'tools-redis'
REDIS_PORT = 6379
REDIS_DB = 3
REDIS_PASSWORD = ''
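The placeholder ``'insert_a_random_hash_here'`` has to be replaced with an unpredictable value. One way to generate such a value, sketched here with the standard-library ``secrets`` module, is shown below; the helper name ``make_secret_key`` and the length of 50 bytes are assumptions made for this example, not something the PR prescribes.

```python
import secrets


def make_secret_key(nbytes=50):
    """Return a URL-safe random string usable as a SECRET_KEY value."""
    return secrets.token_urlsafe(nbytes)


key = make_secret_key()
# Paste the printed line into secret.py; do not commit it to Git.
print("SECRET_KEY = {!r}".format(key))
```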

2 changes: 1 addition & 1 deletion editgroups/templates/editgroups/common.html
@@ -37,7 +37,7 @@
<button type="submit" class="btn btn-default">Submit</button>
</form> -->
<ul class="nav navbar-nav navbar-right">
<li><a href="https://www.wikidata.org/wiki/Wikidata:Edit_groups">About</a></li>
<li><a href="{{ USER_DOCS_HOMEPAGE }}">About</a></li>
{% if not user.is_authenticated %}
<li><a href="{% url "social:begin" 'mediawiki' %}">Login</a></li>
{% else %}
6 changes: 3 additions & 3 deletions listener.py
@@ -16,12 +16,12 @@
import django
django.setup()

from store.stream import WikidataEditStream
from store.stream import WikiEditStream
from store.utils import grouper
from store.models import Edit

print('Listening to Wikidata edits...')
s = WikidataEditStream()
print('Listening to edits...')
s = WikiEditStream()
utcnow = datetime.utcnow()
try:
latest_edit_seen = Edit.objects.order_by('-timestamp')[0].timestamp
10 changes: 3 additions & 7 deletions revert/models.py
@@ -32,7 +32,7 @@ def __str__(self):

def comment_with_stamp(self):
return (self.comment +
' ([[:toollabs:editgroups/b/EG/{}|details]])'.format(self.uid))
settings.REVERT_COMMENT_STAMP.format(self.uid))

def undo_summary(self, edit):
prefix = '/* undo:0||{}|{} */ '.format(edit.newrevid, edit.user)
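Since ``comment_with_stamp`` now calls ``str.format`` on a configurable setting, a customised ``REVERT_COMMENT_STAMP`` needs exactly one ``{}`` placeholder for the batch uid, and any literal braces in it would have to be doubled. A minimal standalone sketch of the formatting step follows; the free function and the sample comment and uid are made up for illustration, whereas in the PR this is a method reading ``self.comment`` and ``self.uid``.

```python
# Default value from editgroups/settings/common.py in this PR.
REVERT_COMMENT_STAMP = ' ([[:toollabs:editgroups/b/EG/{}|details]])'


def comment_with_stamp(comment, uid):
    """Append the wikilink stamp pointing at the revert batch."""
    return comment + REVERT_COMMENT_STAMP.format(uid)


# Hypothetical comment and uid.
print(comment_with_stamp('reverting test batch', 'abc123'))
# prints: reverting test batch ([[:toollabs:editgroups/b/EG/abc123|details]])
```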
@@ -63,13 +63,11 @@ def revert_edit(self, edit):
self.oauth_tokens['oauth_token_secret'])

# Get token
r = requests.get('https://www.wikidata.org/w/api.php', params={
r = requests.get(settings.MEDIAWIKI_API_ENDPOINT, params={
'action':'query',
'meta':'tokens',
'format': 'json',
}, auth=auth)
print('#### GET TOKEN')
print(r.text)
r.raise_for_status()
token = r.json()['query']['tokens']['csrftoken']

@@ -105,11 +103,9 @@ def revert_edit(self, edit):
'watchlist': 'nochange',
}

r = requests.post('https://www.wikidata.org/w/api.php',
r = requests.post(settings.MEDIAWIKI_API_ENDPOINT,
data=data, auth=auth)

print('#### UNDO EDIT')
print(r.text)
#r.raise_for_status()


2 changes: 1 addition & 1 deletion revert/templates/revert/initiate.html
@@ -8,7 +8,7 @@

{% block mainBody %}
<div class="page-header">
<h3>Undoing edit group by <a href="https://www.wikidata.org/wiki/User:{{ batch.user }}">{{ batch.user }}</a>: {{ batch.summary }} ({{ batch.uid }})</h3>
<h3>Undoing edit group by <a href="{{ USER_BASE_URL }}{{ batch.user }}">{{ batch.user }}</a>: {{ batch.summary }} ({{ batch.uid }})</h3>
</div>
<div class="revert-dialog">
<p>You are about to undo {{ batch.nb_revertable_edits }} edits{% if batch.nb_undeleted_new_pages %}, which will delete or restore {{ batch.nb_undeleted_new_pages }} items{% endif %}.</p>
12 changes: 6 additions & 6 deletions store/models.py
@@ -274,7 +274,7 @@ def archive_old_batches(cls, batch_inspector):

class Edit(models.Model):
"""
A wikidata edit as returned by the Event Stream API
A MediaWiki edit as returned by the Event Stream API
"""
id = models.IntegerField(unique=True, primary_key=True)
oldrevid = models.IntegerField(null=True)
@@ -306,16 +306,16 @@ class Meta:

@property
def url(self):
return 'https://www.wikidata.org/wiki/index.php?diff={}&oldid={}'.format(self.newrevid,self.oldrevid)
return '{}?diff={}&oldid={}'.format(settings.MEDIAWIKI_INDEX_ENDPOINT, self.newrevid, self.oldrevid)

@property
def revert_url(self):
if self.oldrevid:
return 'https://www.wikidata.org/w/index.php?title={}&action=edit&undoafter={}&undo={}'.format(self.title, self.oldrevid, self.newrevid)
return '{}?title={}&action=edit&undoafter={}&undo={}'.format(settings.MEDIAWIKI_INDEX_ENDPOINT, self.title, self.oldrevid, self.newrevid)
elif self.changetype == 'delete':
return 'https://www.wikidata.org/wiki/Special:Undelete/{}'.format(self.title)
return '{}Special:Undelete/{}'.format(settings.MEDIAWIKI_BASE_URL, self.title)
else:
return 'https://www.wikidata.org/w/index.php?title={}&action=delete'.format(self.title)
return '{}?title={}&action=delete'.format(settings.MEDIAWIKI_INDEX_ENDPOINT, self.title)

def __str__(self):
return '<Edit {} >'.format(self.url)
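The URL properties above interpolate ``self.title`` into the query string with ``str.format``, which does no escaping. As a possible alternative, and not what the PR does, the query string could be built with ``urllib.parse.urlencode`` so that titles containing characters such as ``&`` or spaces are percent-encoded. In this sketch the free functions ``diff_url`` and ``undo_url`` and the sample title are hypothetical; the endpoint value is the default from ``common.py``.

```python
from urllib.parse import urlencode

# Default from editgroups/settings/common.py in this PR.
MEDIAWIKI_INDEX_ENDPOINT = 'https://www.wikidata.org/w/index.php'


def diff_url(newrevid, oldrevid):
    """Build the diff URL for an edit."""
    query = urlencode({'diff': newrevid, 'oldid': oldrevid})
    return '{}?{}'.format(MEDIAWIKI_INDEX_ENDPOINT, query)


def undo_url(title, oldrevid, newrevid):
    """Build the undo URL, escaping the page title."""
    query = urlencode({
        'title': title,
        'action': 'edit',
        'undoafter': oldrevid,
        'undo': newrevid,
    })
    return '{}?{}'.format(MEDIAWIKI_INDEX_ENDPOINT, query)


print(diff_url(124, 123))
print(undo_url('Q1 & Q2', 123, 124))  # made-up title containing '&'
```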
@@ -366,7 +366,7 @@ def ingest_edits(cls, json_batch):
tools = Tool.objects.all()

for edit_json in json_batch:
if not edit_json or edit_json.get('namespace') not in [0,120]:
if not edit_json or edit_json.get('namespace') not in settings.WATCHED_NAMESPACES:
continue
timestamp = datetime.fromtimestamp(edit_json['timestamp'], tz=UTC)
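The namespace check in ``ingest_edits`` can be illustrated in isolation. This is a minimal sketch, assuming the events are dictionaries with a ``namespace`` key as in the loop above; the helper ``is_watched`` and the sample batch are invented for the example, and ``[0, 120]`` is the default ``WATCHED_NAMESPACES`` from ``common.py``.

```python
# Default from editgroups/settings/common.py in this PR.
WATCHED_NAMESPACES = [0, 120]


def is_watched(edit_json, namespaces=WATCHED_NAMESPACES):
    """Mirror the filter in ingest_edits: skip empty events and other namespaces."""
    return bool(edit_json) and edit_json.get('namespace') in namespaces


# Invented sample batch.
batch = [
    {'namespace': 0, 'title': 'Q42'},
    None,
    {'namespace': 2, 'title': 'User:Example'},
    {'namespace': 120, 'title': 'Property:P31'},
]
kept = [edit['title'] for edit in batch if is_watched(edit)]
print(kept)  # prints: ['Q42', 'Property:P31']
```

A deployment for another wiki would set ``WATCHED_NAMESPACES`` to the namespace ids relevant there.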

5 changes: 3 additions & 2 deletions store/stream.py
@@ -1,10 +1,11 @@
import json
from sseclient import SSEClient as EventSource
from django.conf import settings

class WikidataEditStream(object):
class WikiEditStream(object):
def __init__(self):
self.url = 'https://stream.wikimedia.org/v2/stream/recentchange'
self.wiki = 'wikidatawiki'
self.wiki = settings.WIKI_CODENAME

def stream(self, from_time=None):
url = self.url
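The rest of ``stream`` is not shown in this diff. As a rough illustration of what selecting events for the configured wiki could look like, the sketch below filters already-received event payloads on their ``wiki`` field. Everything here is an assumption rather than the project's actual code: ``events_for_wiki`` is a hypothetical helper, and the raw payloads are invented so that the example runs without a network connection or ``sseclient``.

```python
import json


def events_for_wiki(raw_events, wiki):
    """Yield decoded events whose 'wiki' field matches the given codename."""
    for raw in raw_events:
        if not raw:
            continue  # skip empty payloads
        try:
            event = json.loads(raw)
        except ValueError:
            continue  # skip payloads that are not valid JSON
        if isinstance(event, dict) and event.get('wiki') == wiki:
            yield event


# Invented payloads standing in for data received from the stream.
raw = [
    json.dumps({'wiki': 'wikidatawiki', 'title': 'Q42'}),
    '',
    json.dumps({'wiki': 'commonswiki', 'title': 'File:Example.jpg'}),
    'not json',
]
selected = list(events_for_wiki(raw, 'wikidatawiki'))
print([event['title'] for event in selected])  # prints: ['Q42']
```

With the change in this PR, the codename passed in would come from ``settings.WIKI_CODENAME`` instead of the hard-coded ``'wikidatawiki'``.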