Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mentions PoC #9279

Draft
wants to merge 21 commits into
base: main
Choose a base branch
from
Draft

Mentions PoC #9279

wants to merge 21 commits into from

Conversation

mtomilov
Copy link
Contributor

@mtomilov mtomilov commented Jan 27, 2025

Refs #9281

Adds @mentions support in the API, emails will be addressed separately.
All of create / get / edit / search endpoints now support them.

Testing API

  • Run migrations make db
  • Login as devdata_admin and generate API token from http://localhost:5000/account/developer
  • Create annotation
     curl -X POST --location "http://localhost:5000/api/annotations" \
     -H "Authorization: Bearer <token>" \
     -H "Content-Type: application/json" \
     -d '{
             "uri": "https://www.google.com/",
             "text": "**hello** <span>world</span>"
         }'
    
    Mentions can be already put in the text here and will be returned in the response.
  • Edit annotation
     curl -X PATCH --location "http://localhost:5000/api/annotations/<url safe id>" \
         -H "Authorization: Bearer <token>" \
         -H "Content-Type: application/json" \
         -d '{
                 "text": "Hello <a data-hyp-mention data-userid=\"acct:devdata_admin@localhost\">@devdata_admin</a>, take a look at this\nHello <a data-hyp-mention data-userid=\"acct:devdata_user@localhost\">@devdata_user</a>, take a look at this"
             }'
    
    But I think it's easier to test via edit on already created annotation.
  • Search annotations
    curl -X GET --location "http://localhost:5000/api/search" \ -H "Authorization: Bearer <token>"
    Mentions will be returned here as well

Testing emails

Notifications are sent only on creation for now

  • Create annotation
     curl -X POST --location "http://localhost:5000/api/annotations" \
     -H "Authorization: Bearer <token>" \
     -H "Content-Type: application/json" \
     -d '{
             "uri": "https://www.google.com/",
             "text": "Hello <a data-hyp-mention data-userid=\"acct:devdata_admin@localhost\">@devdata_admin</a>, take a look at this\nHello <a data-hyp-mention data-userid=\"acct:devdata_user@localhost\">@devdata_user</a>, take a look at this"
         }'
    
  • Check h/mail folder for incoming mail

@@ -54,7 +54,12 @@ def _annotation_search_query(
query = query.where(Annotation.id.in_(ids))

if eager_load:
query = query.options(*(subqueryload(prop) for prop in eager_load))
for prop in eager_load:
if isinstance(prop, tuple) and len(prop) == 2:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick and dirty query optimization to load all mentions at once

userids = set(self._parse_userids(annotation.text))
users = self._user_service.fetch_all(userids)
self._session.execute(
sa.delete(Mention).where(Mention.annotation_id == annotation.slim.id)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I should've put mentions on Annotation model directly instead of its slim sibling

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think they should be linked with the regular annotation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AnnotationSlim has better FKs to user, group etc.

I think it should be the default go to for new functionality.

That being said if we are using AnnotationSlim at the model we should also use it a the service level, ie, this method should take one annotation_slim: AnnotationSlim.

We have used AnnotasionSlim for features that mostly make expensive queries, it might be a bit awkward to use it something like this where you have to keep track of both tables. IMO that's not that bad if most of the code around mentions uses AnnotationSlim and the callers are the ones that pass the right type of model.

As more features that use AnnotationSlim it should be easier to make it the default but I don't think is a hard requirement for mentions to use it, if actually using it becomes a problem there's no issue of going back to Annotation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me rethink that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switched to Annotation to try it out


@staticmethod
def _parse_userids(text: str) -> list[str]:
return USERID_PAT.findall(text)
Copy link
Contributor Author

@mtomilov mtomilov Jan 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll probably want to parse userids from text_rendered html.

@mtomilov mtomilov changed the title Mentions POC Mentions PoC Jan 27, 2025
@mtomilov mtomilov requested a review from marcospri January 27, 2025 16:16
"mentions": {
"type": "array",
"items": copy.deepcopy(MENTION_SCHEMA),
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used to validate annotations on creation? If so, I was not expecting the mentions list to be sent on creation, only to be returned on fetch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this seems to be for validation.
@marcospri Could you please point me to where we describe response schema?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know the answer. We have

https://github.com/hypothesis/h/blob/mentions-poc/docs/_extra/api-reference/schemas/annotation.yaml#L1

which seems to be part of the docs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I've removed it from here and added to the yaml.

from h.models import Annotation, Mention
from h.services.user import UserService

USERID_PAT = re.compile(r"data-userid=\"([^\"]+)\"")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps a bit too loose, but could work for prototyping

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In future, I think someone suggested the idea of extracting the ids from the annotation rendered HTML, via DOM selector (html.find('a[data-hyp-mention]).dataset.userid)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I guess we use a real parser for the real production code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I just noticed you commented precisely this here #9279 (comment)

userids = set(self._parse_userids(annotation.text))
users = self._user_service.fetch_all(userids)
self._session.execute(
sa.delete(Mention).where(Mention.annotation_id == annotation.slim.id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think they should be linked with the regular annotation.



def upgrade() -> None:
op.add_column("mention", sa.Column("username", sa.UnicodeText(), nullable=False))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand what this is for. Is it related to supporting changing usernames?

Are we going to support that on v1?

I wonder if we should store the history of usernames at the user level instead of at the mentions one, ie a table/column with the history of a user's usernames.

Copy link
Contributor Author

@mtomilov mtomilov Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it doesn't have to be in v1, removed for now.

As for full support sure, I was thinking something along the lines of UserDelete table but for renames in addition to storing usernames here. Then we'd be able to handle all the edge cases.

@@ -28,5 +28,8 @@ class Mention(Base, Timestamps): # pragma: nocover
"""FK to user.id"""
user = sa.orm.relationship("User")

username = sa.Column("username", sa.UnicodeText(), nullable=False)
"""Current username of the user mentioned"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it "current" or the one that appears on the annotation text?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, might not have been the best name.

userids = set(self._parse_userids(annotation.text))
users = self._user_service.fetch_all(userids)
self._session.execute(
sa.delete(Mention).where(Mention.annotation_id == annotation.slim.id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AnnotationSlim has better FKs to user, group etc.

I think it should be the default go to for new functionality.

That being said if we are using AnnotationSlim at the model we should also use it a the service level, ie, this method should take one annotation_slim: AnnotationSlim.

We have used AnnotasionSlim for features that mostly make expensive queries, it might be a bit awkward to use it something like this where you have to keep track of both tables. IMO that's not that bad if most of the code around mentions uses AnnotationSlim and the callers are the ones that pass the right type of model.

As more features that use AnnotationSlim it should be easier to make it the default but I don't think is a hard requirement for mentions to use it, if actually using it becomes a problem there's no issue of going back to Annotation.

import re
from typing import Iterable, Sequence

import sqlalchemy as sa
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't be shy about importing anything from sqlalchemy. I mean speling the whole from sqlalchemy import select, ....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, is this the recommended way?

from h.models import Annotation, Mention
from h.services.user import UserService

USERID_PAT = re.compile(r"data-userid=\"([^\"]+)\"")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I guess we use a real parser for the real production code.

@@ -151,6 +160,8 @@ def present_all_for_user(self, annotation_ids, user: User):
# which ultimately depends on group permissions, causing a
# group lookup for every annotation without this
Annotation.group,
# Optimise access to the mentions
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is the main pain point for using AnnotationSlim vs Annotation

Copy link
Contributor Author

@mtomilov mtomilov Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, feels a bit awkward as is

"mentions": {
"type": "array",
"items": copy.deepcopy(MENTION_SCHEMA),
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know the answer. We have

https://github.com/hypothesis/h/blob/mentions-poc/docs/_extra/api-reference/schemas/annotation.yaml#L1

which seems to be part of the docs.

@@ -71,6 +76,10 @@ def present(self, annotation: Annotation):
"target": annotation.target,
"document": DocumentJSONPresenter(annotation.document).asdict(),
"links": self._links_service.get_all(annotation),
"mentions": [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want mentions to be in all responses or only on the ones that do have mentions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed from here



def upgrade() -> None:
op.drop_column("mention", "annotation_id")
Copy link
Contributor Author

@mtomilov mtomilov Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mentions aren't being used yet so nothing actually destructive here

return [notification.mentioned_user.email], subject, text, html


def _get_user_url(user, request):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copied from emails/reply_notifcation.py for now

document: Document


def get_notifications(request, annotation: Annotation, action) -> list[Notification]:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly follows notification/reply.py

return []

notifications = []
for mention in annotation.mentions:
Copy link
Contributor Author

@mtomilov mtomilov Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will need to factor in groups / permissions / limits here as well to avoid sending notifications when not appropriate.

commented:
</p>

<blockquote>{{ annotation.text or "" }}</blockquote>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taking the same approach we already have in emails/reply_notification.html.jinja2

def update_mentions(self, annotation: Annotation) -> None:
self._session.flush()

userids = set(self._parse_userids(annotation.text))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making sure that single user is mentioned only once

@mtomilov mtomilov requested review from acelaya and marcospri January 29, 2025 15:31
commented:
</p>

<blockquote>{{ annotation.text|striptags or "" }}</blockquote>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stripping away our custom mention tags

@@ -89,3 +89,28 @@ def send_reply_notifications(event):
except OperationalError as err: # pragma: no cover
# We could not connect to rabbit! So carry on
report_exception(err)


@subscriber(AnnotationEvent)
Copy link
Contributor Author

@mtomilov mtomilov Jan 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option could be creating a new MentionEvent instead

@@ -21,7 +21,7 @@
commented:
</p>

<blockquote>{{ parent.text or "" }}</blockquote>
<blockquote>{{ parent.text|striptags or "" }}</blockquote>
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have custom tags in annotations we need to strip them in replies as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants