Add page count to crawl model #2257

Open
tw4l opened this issue Dec 18, 2024 · 0 comments
Labels: back end (requires back end dev work), front end (requires front end dev work)

tw4l commented Dec 18, 2024

The backend work for public collections (added in #2198) reads pages from uploads into the database, so we now have pages in the database for both crawls and uploads. However, in various places throughout the application (e.g. the archived items list and the list of archived items in a collection), we report the number of pages for crawls but not for uploads. This is because the value we currently use for a crawl's page count (crawl.stats.done) only exists for crawls.

Now that we have pages in the database for both crawls and uploads, we should count these page documents per archived item and store the count in the crawls model in a way that is consistent for all archived items. The frontend should then use that new count everywhere it lists the page count for archived items.

This will likely require a migration to backfill data as well.
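
To make the proposal concrete, here is a minimal sketch of the counting and backfill logic, using plain in-memory dictionaries rather than the real database. The field names `crawl_id` and `pageCount`, the `id` key, and both function names are assumptions for illustration only and are not taken from the codebase.

```python
from collections import Counter
from typing import Iterable, Mapping


def count_pages_by_item(pages: Iterable[Mapping]) -> Counter:
    """Count page documents per archived item.

    Assumes each page carries a hypothetical "crawl_id" field that
    references its archived item (crawl or upload).
    """
    return Counter(page["crawl_id"] for page in pages)


def backfill_page_counts(items: list[dict], pages: Iterable[Mapping]) -> None:
    """Store a hypothetical "pageCount" field on every archived item."""
    counts = count_pages_by_item(pages)
    for item in items:
        # Items with no stored pages get 0 instead of a missing field,
        # so the frontend can treat crawls and uploads the same way.
        item["pageCount"] = counts.get(item["id"], 0)


# Example data: one crawl and two uploads, one of which has no pages.
items = [
    {"id": "crawl-1", "type": "crawl"},
    {"id": "upload-1", "type": "upload"},
    {"id": "upload-2", "type": "upload"},
]
pages = [
    {"crawl_id": "crawl-1", "url": "https://example.com/"},
    {"crawl_id": "crawl-1", "url": "https://example.com/about"},
    {"crawl_id": "upload-1", "url": "https://example.org/"},
]

backfill_page_counts(items, pages)
```

In a real migration the same grouping would presumably run as a database aggregation over the pages collection, with the result written back to each archived item; the sketch only shows the shape of the computation.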

tw4l self-assigned this Dec 18, 2024
ikreymer moved this from Triage to Todo in Webrecorder Projects Dec 18, 2024
tw4l moved this from Todo to Ready in Webrecorder Projects Dec 18, 2024
tw4l added the "back end" and "front end" labels Dec 18, 2024
Projects
Status: Ready