-
-
Notifications
You must be signed in to change notification settings - Fork 157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(recap): Enable appellate PDF purchases #4948
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple little things from me, but I leave it to @albertisfu to do the full review. Thank you both!
This commit replaces direct court_id retrieval from the db in the appellate court check within the if statement with the existing pacer_court_id variable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good and works properly for purchasing appellate documents without attachments.
But I tested purchasing a main appellate document with attachments and the purchase failed. However, the issue seems to lie in Juriscraper, where the download_pdf
method in AppellateDocketReport
does not use the make_doc1_url
method to build the document URL.
As a result, it does not change the four-digit from 0 to 1 and instead of retrieving the PDF, it retrieves the attachment page, causing the process to fail.
@albertisfu Thanks for the review!
I believe this behavior is correct. Appellate attachments don't have a main document. Therefore, if a user tries to use the Here's an example (I uninstalled the extension to prevent the entry from being updated): The first entry from case 25-1055 has attachments, but it's currently listed as a regular entry. This is because we haven't yet processed an upload that would update the entry with the proper attachment metadata If a user requests to purchase the document for this entry, CL will use Furthermore, when examining the attachment page, you'll find that CL would have bought Attachment 3 if we changed the fourth digit. On the other hand, since you brought this up, I tried buying attachments from an entry we already show as having attachments, and it didn't work. So, I think we can fix this by issue doing the following:
def download_pacer_pdf_by_rd(
rd_pk: int,
pacer_case_id: str,
pacer_doc_id: int,
session_data: SessionData,
magic_number: str | None = None,
de_seq_num: str | None = None,
) -> tuple[Response | None, str]:
...
rd = RECAPDocument.objects.get(pk=rd_pk)
pacer_court_id = map_cl_to_pacer_id(rd.docket_entry.docket.court_id)
s = ProxyPacerSession(
cookies=session_data.cookies, proxy=session_data.proxy_address
)
if is_appellate_court(pacer_court_id):
report = AppellateDocketReport(pacer_court_id, s)
+ pacer_doc_id = (
+ pacer_doc_id
+ if not rd.attachment_number
+ else f"{pacer_doc_id[:3]}1{pacer_doc_id[4:]}"
+ )
r, r_msg = report.download_pdf(
pacer_doc_id=pacer_doc_id, pacer_case_id=pacer_case_id
)
else:
report = FreeOpinionReport(pacer_court_id, s)
r, r_msg = report.download_pdf(
pacer_case_id, pacer_doc_id, magic_number, de_seq_num=de_seq_num
)
return r, r_msg let me know what you think. |
Thanks @ERosendo for the details. I recreated the process using the example you provided, and I believe this is not an issue:
This is because it’s possible for the So, if a user buys the "main document" (which is actually Attachment 3) when the attachment data is not available, the PDF for Attachment 3 will still be correctly retrieved when changing the fourth digit from 0 to 1. However, it will initially be displayed as the "Main document" until the attachment metadata is retrieved: Once the attachment metadata is received, that "Main document" will be converted to Do you think this can still be a problem? I agree with your logic for fixing the |
@albertisfu Thanks for looking into this. However, I'm still hesitant about purchasing and using this document as the main document for the following reasons:
I think we should try adding an error message in Juriscraper that informs users that the document might be an attachment and that they should retry the fetch request. This would help maintain data accuracy. What do you think? |
Yeah, those are valid concerns! I think adding an error message in Juriscraper would be helpful.
|
@albertisfu Here's the juriscraper PR: freelawproject/juriscraper#1309 In commit 0f6ad2e, I added a test to ensure we handle failed purchases correctly by adding the error message to the processing queue. The Juriscraper PR is just adding another error message for a different scenario. I believe it's safe to merge this PR independently of the juriscraper PR. |
Thanks. I took a look to your Juriscraper PR https://github.com/freelawproject/juriscraper/pull/1309/files which modifies the error message for failed purchases that returned an attachment page instead. I’m not entirely sure about the error message you’re using since it mentions "our system," which refers to CourtListener. Considering that Juriscraper is widely used outside of CL, I was thinking of a more generic message like:
This message could then be used in CL to assign the FQ message to a more detailed one for CL API users, such the one you have in Juriscraper:
What do you think? If you agree, should the logic to enrich the FQ message be added in this PR? Additionally, will the logic you suggested in #4948 (comment) to handle appellate attachment PDF purchases be included in a separate PR? |
@albertisfu I like your suggestion. Let's add that logic in this PR.
I apologize! I was under the impression that I had pushed those changes along with the new test |
This commit adds logic to adjust the pacer_doc_id used for purchasing PDF attachments
63c9b50
to
1b184ae
Compare
@albertisfu I've implemented your suggestions, including the error message and purchase attachment tweak. Ready for another review! |
I'm ducking out of this one and leaving it to you guys. You're doing great! If you are both happy, let's do it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @ERosendo! This looks pretty close just a small comment, and I think we're ready to go.
Also, I've merged the Juriscraper PR. We should just remember to do a release before appellate attachment purchases are enabled.
- Refactored download_pacer_pdf_by_rd to improve type hints for pacer_doc_id. - Fixed the logic for handling attachment document purchases by simplifying the process of updating the fourth digit of the pacer_doc_id.
Thanks @ERosendo this looks great now! Set to auto-merge. |
This PR addresses issue #4861
Key Changes:
Enhanced download_pacer_pdf_by_rd method: This method now incorporates a check for the court_id of the recap document. Based on the court_id, the method adjusts its behavior to ensure proper PDF downloads.
Adds a custom error message when trying to purchase ACMS PDFs.
I created a separate issue to review and explore potential methods for purchasing PDFs from ACMS courts.