Add stop words to Meili search index #1307

Draft: wants to merge 1 commit into base branch next

LukasKalbertodt
Member

Filtering out very common words that carry almost no information improves indexing performance, shrinks the index, and, most importantly, helps with the problem that searching for common words produces tons of matches in subtitles and similar texts. It does not completely solve that problem, though. Unfortunately, using stop words also makes some things worse: especially in phrase search, the highlighting is broken and might confuse users. Phrase search still kind of works, but from reading the docs, I think that with the stop words "the" and "a", searching for "foo the bar" will also find documents containing the text "foo a bar".
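To make the phrase search concern concrete, here is a toy model in Python. It is only a sketch of the behavior as I understand it from the docs, not Meili's actual implementation: every stop word is treated as an interchangeable placeholder, so two phrases that differ only in their stop words compare equal. The names `STOP_WORDS`, `normalize` and `phrase_matches` are made up for this illustration.

```python
import re

# Hypothetical stop word list, just the two words from the example above.
STOP_WORDS = frozenset({"the", "a"})


def normalize(text, stop_words=STOP_WORDS):
    """Lowercase and tokenize; replace every stop word with a None placeholder."""
    tokens = re.findall(r"\w+", text.lower())
    return [None if token in stop_words else token for token in tokens]


def phrase_matches(phrase, document, stop_words=STOP_WORDS):
    """Return True if the normalized phrase occurs contiguously in the document."""
    needle = normalize(phrase, stop_words)
    haystack = normalize(document, stop_words)
    if not needle:
        return False
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


# With stop words configured, "the" and "a" are indistinguishable.
print(phrase_matches("foo the bar", "some foo a bar text"))
# Without stop words, the phrase has to match exactly.
print(phrase_matches("foo the bar", "some foo a bar text", stop_words=frozenset()))
```

In this model the stop word position still has to be occupied by some stop word. Whether Meili behaves exactly like that is an assumption I have not verified.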

So it's not yet clear whether we want this at all. Maybe Meili needs to improve first. Or we never send the stop words to Meili and only use them to filter some things manually?
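One possible shape of the "filter manually" option, as a rough Python sketch: the stop word list stays on our side and is only used to drop bare stop words from the user's query before it is sent to the search engine. Quoted phrases are left untouched so that phrase search and highlighting keep working on the original text. The function name `strip_stop_words`, the word list and the fallback rule for queries that consist only of stop words are all assumptions for illustration, not something this PR implements.

```python
import re

# Hypothetical stop word list; a real one would be per language and much longer.
STOP_WORDS = frozenset({"the", "a", "of", "and"})


def strip_stop_words(query, stop_words=STOP_WORDS):
    """Drop bare stop words from a query, keeping quoted phrases as they are."""
    # Split into "quoted phrases" and whitespace-separated bare words.
    parts = re.findall(r'"[^"]*"|\S+', query)
    kept = [
        part for part in parts
        if part.startswith('"') or part.lower() not in stop_words
    ]
    # If nothing is left, fall back to the original query instead of an empty one.
    return " ".join(kept) if kept else query


print(strip_stop_words("the history of rome"))
print(strip_stop_words('"foo the bar" a baz'))
```

This only affects the query side. The index would still contain the common words, so the index size and indexing performance benefits mentioned above would not apply.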
