openAccessPdf should map to pdf_url
#1221
This PR is being reviewed by Cursor Bugbot
Bug: pdf_url field not populated from API mapping
The API mapping change separated openAccessPdf from the url field to its own pdf_url field, but the parsing code in parse_s2_to_doc_details still extracts openAccessPdf and assigns it to the url field instead of pdf_url. This causes the wrong field to be populated in the DocDetails object, misaligning with the documented mapping.
src/paperqa/clients/semantic_scholar.py#L202-L203
paper-qa/src/paperqa/clients/semantic_scholar.py
Lines 202 to 203 in aa3061f
    journal=journal_data.get("name"),
    url=(paper_data.get("openAccessPdf") or {}).get("url"),
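One way the parsing side could be brought in line with the new mapping is sketched below. This is a minimal illustration, not the project's actual fix: `extract_urls` is a hypothetical helper, and the only things taken from the PR are the `paper_data` dict, the `openAccessPdf` and `url` keys, and the `pdf_url` field name.

```python
from typing import Any, Optional


def extract_urls(paper_data: dict[str, Any]) -> dict[str, Optional[str]]:
    """Hypothetical helper: keep the page link and the PDF link separate."""
    # openAccessPdf can be missing or null in a response, so fall back to {}
    # (the same guard used in the lines quoted above).
    open_access_pdf = paper_data.get("openAccessPdf") or {}
    return {
        # Top-level url: the Semantic Scholar page link.
        "url": paper_data.get("url"),
        # Nested url inside openAccessPdf: the open access PDF link.
        "pdf_url": open_access_pdf.get("url"),
    }
```

The returned keys could then be passed as the `url` and `pdf_url` keyword arguments wherever the DocDetails object is built, assuming it accepts both.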
Pull request overview
This PR corrects the mapping of the Semantic Scholar API's openAccessPdf field from url to pdf_url in the DocDetails model. Previously, openAccessPdf was incorrectly grouped with url in the API mapping.
Key Changes:
- Separated url and pdf_url mappings in SEMANTIC_SCHOLAR_API_MAPPING
- url now maps only to Semantic Scholar's url field (the Semantic Scholar page link)
- pdf_url now correctly maps to Semantic Scholar's openAccessPdf field (the open access PDF link)
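The effect of the separation can be sketched with a toy version of the mapping. The dict shape here (DocDetails field name mapped to a set of Semantic Scholar field names) is an assumption for illustration, as are the names `OLD_MAPPING`, `NEW_MAPPING`, and `s2_fields_for`; only the two entries discussed in this PR are shown.

```python
# Assumed shape: DocDetails field name -> Semantic Scholar fields feeding it.
# Before the change, openAccessPdf was grouped under url.
OLD_MAPPING = {"url": {"url", "openAccessPdf"}}
# After the change, each DocDetails field has its own source field.
NEW_MAPPING = {"url": {"url"}, "pdf_url": {"openAccessPdf"}}


def s2_fields_for(doc_fields, mapping):
    """Collect the Semantic Scholar fields needed for the given DocDetails fields."""
    needed = set()
    for field in doc_fields:
        # Fields absent from the mapping contribute nothing.
        needed |= mapping.get(field, set())
    return sorted(needed)
```

Under the old grouping, asking for pdf_url alone would resolve to no Semantic Scholar field at all, while asking for url would also pull in openAccessPdf. With the separated entries, each request resolves to exactly one source field.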
jamesbraza
left a comment
Nice sleuthing
Per title - we had it mapped to url
Note
Separate Semantic Scholar field mapping so openAccessPdf maps to pdf_url and url maps only to url.
Written by Cursor Bugbot for commit aa3061f.