Conversation

@sidnarayanan
Collaborator

@sidnarayanan sidnarayanan commented Nov 26, 2025

Per title - we had it mapped to url


Note

Separate Semantic Scholar field mapping so openAccessPdf maps to pdf_url and url maps only to url.

Written by Cursor Bugbot for commit aa3061f.

@dosubot dosubot bot added the size:XS This PR changes 0-9 lines, ignoring generated files. label Nov 26, 2025
@dosubot

dosubot bot commented Nov 26, 2025

Related Documentation

Checked 1 published document(s) in 1 knowledge base(s). No updates required.


@cursor cursor bot left a comment


This PR is being reviewed by Cursor Bugbot


Bug: pdf_url field not populated from API mapping

The API mapping change separated openAccessPdf from the url field to its own pdf_url field, but the parsing code in parse_s2_to_doc_details still extracts openAccessPdf and assigns it to the url field instead of pdf_url. This causes the wrong field to be populated in the DocDetails object, misaligning with the documented mapping.

src/paperqa/clients/semantic_scholar.py#L202-L203

journal=journal_data.get("name"),
url=(paper_data.get("openAccessPdf") or {}).get("url"),
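A minimal sketch of the parsing behaviour this comment asks for, assuming the Semantic Scholar response is a plain dict with an optional `url` string and an optional `openAccessPdf` object that carries its own `url`. The helper name `extract_urls` is hypothetical and is not part of paperqa; the real `parse_s2_to_doc_details` builds a `DocDetails` object rather than a dict.

```python
from typing import Any, Optional


def extract_urls(paper_data: dict[str, Any]) -> dict[str, Optional[str]]:
    """Hypothetical helper: keep the S2 page link and the open-access PDF link apart."""
    # openAccessPdf can be absent or null in the response, so fall back to {}
    open_access = paper_data.get("openAccessPdf") or {}
    return {
        # Link to the paper's Semantic Scholar page
        "url": paper_data.get("url"),
        # Link to the open-access PDF, if one is reported
        "pdf_url": open_access.get("url"),
    }
```

With this shape, a record that has no open-access PDF still yields its page link in `url`, and `pdf_url` is simply `None`.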




@dosubot dosubot bot added the enhancement New feature or request label Nov 26, 2025
Contributor

Copilot AI left a comment


Pull request overview

This PR corrects the mapping of the Semantic Scholar API's openAccessPdf field from url to pdf_url in the DocDetails model. Previously, openAccessPdf was incorrectly grouped with url in the API mapping.

Key Changes:

  • Separated url and pdf_url mappings in SEMANTIC_SCHOLAR_API_MAPPING
  • url now maps only to Semantic Scholar's url field (the Semantic Scholar page link)
  • pdf_url now correctly maps to Semantic Scholar's openAccessPdf field (the open access PDF link)
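The effect of the separation can be illustrated with a small sketch. The dict shapes below are an assumption (each `DocDetails` field name mapped to the set of Semantic Scholar fields it needs), and `MAPPING_BEFORE`, `MAPPING_AFTER`, and `s2_fields_for` are hypothetical names used only for illustration, not the contents of `SEMANTIC_SCHOLAR_API_MAPPING` itself.

```python
# Assumed shape: DocDetails field name -> Semantic Scholar fields to request.
MAPPING_BEFORE = {"url": {"url", "openAccessPdf"}}
MAPPING_AFTER = {"url": {"url"}, "pdf_url": {"openAccessPdf"}}


def s2_fields_for(requested, mapping):
    """Return the sorted Semantic Scholar fields needed for the requested DocDetails fields."""
    fields = set()
    for name in requested:
        # Fields with no mapping entry contribute nothing to the request
        fields |= mapping.get(name, set())
    return sorted(fields)
```

Under the old grouping, asking for `pdf_url` resolves to no Semantic Scholar field at all, while asking for `url` also pulls in `openAccessPdf`; under the new mapping each field resolves only to its own source.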

Collaborator

@jamesbraza jamesbraza left a comment


Nice sleuthing

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Nov 26, 2025
@dosubot dosubot bot added size:S This PR changes 10-29 lines, ignoring generated files. and removed size:XS This PR changes 0-9 lines, ignoring generated files. labels Nov 26, 2025
@sidnarayanan sidnarayanan merged commit f38bc6a into main Nov 26, 2025
5 of 7 checks passed
@sidnarayanan sidnarayanan deleted the s2-pdf branch November 26, 2025 01:54