Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IX] - PDF sidepanel oversized documents #7455

Closed
Zasa-san opened this issue Nov 19, 2024 · 4 comments · Fixed by #7485
Closed

[IX] - PDF sidepanel oversized documents #7455

Zasa-san opened this issue Nov 19, 2024 · 4 comments · Fixed by #7485

Comments

@Zasa-san
Copy link
Collaborator

Zasa-san commented Nov 19, 2024

@aphilop @konzz we seem to have a new problem related to the size of the PDF on the side panel, so we may need to promote this issue to high priority.

Now the document is rendered oversized:

image

Tested on FF and Chrome

Originally posted by @txau in #7393 (comment)

@Zasa-san Zasa-san changed the title @aphilop @konzz we seem to have a new problem related to the size of the PDF on the side panel, so we may need to promote this issue to high priority. [IX] - PDF sidepanel oversized documents Nov 19, 2024
@aphilop aphilop added this to the Information Extraction milestone Nov 19, 2024
@Zasa-san Zasa-san self-assigned this Nov 19, 2024
@Zasa-san
Copy link
Collaborator Author

@konzz The problem seems to be with app/react/V2/Components/PDFViewer/PDFPage.tsx, when rendering the pdf page:

   const pageViewer = new PDFJSViewer.PDFPageView({
          container: currentContainer,
          id: page,
          scale: 1.1,
          defaultViewport,
          annotationMode: 0,
          eventBus: new EventBus(),
        });

The scale has been set to 1.1. This is causing the bigger size on the pdf sidepanel, and the distortion of the text selection.

When rending a pdf in the library we still use the old pdfjs component that render the pdf with a scale of 1.

I think that that discrepancy in scales between components is breaking the selections and making the pdf bigger. I don't see an immediate reason as to why the pdf sidepanel's scale was set to 1.1. The scale should be kept at 1 in my opinion, or at least always match both in pdf sidepanel and entity view.

The pdf sidepanel e2e test is done as part of cypress/e2e/settings/information-extraction.cy.ts. It's checking that the PDF renders, and the selections exist. But there is no image snapshot.

There are other elements of app/react/V2/Components/PDFViewer/PDFPage.tsx & app/react/V2/Components/PDFViewer/PDF.tsx that this e2e is not testing, and likely would not be able to test (like pdfjs security isEvalSupported feature been turned off when instantiating the component). This components where made when we were trying to favor e2e over unit tests. I think that we should add some unit tests to this components to cover the more nuanced characteristics that are difficult or not possible to test with e2e.

This issue and #7393 should be fixed by #7456

@txau
Copy link
Collaborator

txau commented Nov 19, 2024

@Zasa-san the best way to test it is to play with the browser zoom scale, since PDF rendering and scaling varies depending on the screen resolution.

@Zasa-san
Copy link
Collaborator Author

After some testing this seems to affect primarily users who are using some kind of zoom factor either in the browser or the OS display settings.

This is unrelated to #7393 and should not affect selections.

I'm attempting to better handle pdfjs's rendering of pages to solve this zoom problem.

@Zasa-san
Copy link
Collaborator Author

Zasa-san commented Jan 7, 2025

I have not been able to identify at what point this issue started. It could be that a pdfjs-dist update introduced this error. The reason it has gone unnoticed is because we don’t have a good test that covers pdf rendering.

In the PR that fixes this issue I’m adding a dedicated e2e to test pdf in the library and the IX sidepanel. The goal is to test that our pdf rendering remains consistent and functional. I have tried covering this with unit tests but the amount of core pdfjs-dist functionality that needs to be mocked makes it pointless. A dedicated e2e is the best solution.

Bugs not fixed by this pr that exist in production already:

  • References not scrolling the selection into view. Generally this scrolling event behaves inconsistently depending on where on the document the user is.
  • Loading a document in page X does not automatically scroll the document to that page. You can still use pagination and it will navigate to the correct page afterwards. Also, when switching from the regular view to the plain text view and back, the pdf always renders in the first page regardless of what page was the user viewing the plain text for.
  • If the user scrolls to a point where two pages are rendering at the same time, paginating stops working. It's finicky, you have to have a page partially visible in the top or bottom, and that breaks paginating forwards or backwards respectively. Reported here Pagination buttons when viewing a document are not working when browser zoomed out #7561
  • @josh-huridocs reported not being able to copy and paste text with keyboard commands. I have tested this and not found this problem. But please do check. It would make no sense for this PR to break that however.

About existing selections:
If they were made before these adjustments by a user who had the scaling problem they will be broken. In fact, are already broken in development for everyone except the user that made them.
I have noticed some broken text selections in the fixtures that I used for testing this PR, but those selections are not present in production, eg: https://summa.cejil.org/en/entity/c335ipmxofp has a broken text reference in the fixtures I used for testing that is not present in production. I'm assuming @txau did these selections in the testing environment, and himself being a user that has the original bug the selection would be broken for everyone but him.

The PR that fixes this issue should also fix https://github.com/huridocs/Internal-Issues/issues/238. @oleksandrako

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants