feat(google-scholar): add cite and profile commands, fix search dedup #1176

Open
Benjamin-eecs wants to merge 1 commit into jackwener:main from Benjamin-eecs:feat/google-scholar-cite

Conversation

@Benjamin-eecs
Contributor

Description

Three changes to the google-scholar adapter:

  1. Fix search duplicate results ([Bug]: google-scholar search returns duplicate results #1174). The CSS selector .gs_r.gs_or.gs_scl, .gs_ri matched both the outer container and its inner .gs_ri child, producing every paper twice. Changed to .gs_r.gs_or.gs_scl only.

  2. Add cite command ([Feature]: add cite and profile commands to google-scholar adapter #1175). Get BibTeX/EndNote/RefMan/RefWorks citation for a paper. Searches for the paper, clicks the cite button in search results, and navigates to Google's citation endpoint to fetch the formatted content. Supports --style to choose format and --index to pick which result to cite.

  3. Add profile command ([Feature]: add cite and profile commands to google-scholar adapter #1175). View an author's Google Scholar profile: name, affiliation, h-index, i10-index, total citations, and top papers. Accepts either an author name (searches for the profile) or a 12-char Scholar user ID (navigates directly).
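The name-or-ID branching described for the profile command can be sketched as follows. This is a minimal illustration, not the adapter's code: `resolveProfileTarget` and `SCHOLAR_ID_PATTERN` are hypothetical names, the 12-character pattern is an assumed heuristic, and both URL shapes are assumptions about how Scholar profile and author-search pages are addressed.

```javascript
// Assumption: a Scholar user ID is exactly 12 characters from [A-Za-z0-9_-],
// e.g. "JicYPdAAAAAJ". Anything else is treated as an author name.
const SCHOLAR_ID_PATTERN = /^[A-Za-z0-9_-]{12}$/;

// Hypothetical helper: decide whether to open a profile directly or to
// search for the author first.
function resolveProfileTarget(input) {
  const value = String(input ?? '').trim();
  if (value === '') {
    throw new Error('author must not be empty');
  }
  if (SCHOLAR_ID_PATTERN.test(value)) {
    // Looks like a user ID: navigate straight to the profile page.
    return {
      kind: 'id',
      url: `https://scholar.google.com/citations?user=${encodeURIComponent(value)}`,
    };
  }
  // Otherwise treat the input as a name and run an author search.
  const params = new URLSearchParams({ view_op: 'search_authors', mauthors: value });
  return { kind: 'name', url: `https://scholar.google.com/citations?${params}` };
}

console.log(resolveProfileTarget('JicYPdAAAAAJ'));
console.log(resolveProfileTarget('Geoffrey Hinton'));
```

A pattern check like this can misclassify a 12-character name that contains no spaces, so a real implementation might fall back to a name search when direct navigation finds no profile.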

Fixes #1174, closes #1175

Type of Change

  • 🐛 Bug fix
  • ✨ New feature
  • 🌐 New site adapter
  • 📝 Documentation
  • ♻️ Refactor
  • 🔧 CI / build / tooling

Checklist

  • I ran the checks relevant to this PR
  • I updated tests or docs if needed
  • I included output or screenshots when useful

Documentation (if adding/modifying an adapter)

  • Added doc page under docs/adapters/ (if new adapter)
  • Updated docs/adapters/index.md table (if new adapter)
  • Updated sidebar in docs/.vitepress/config.mts (if new adapter)
  • Updated README.md / README.zh-CN.md when command discoverability changed
  • Used positional args for the command's primary subject unless a named flag is clearly better
  • Normalized expected adapter failures to CliError subclasses instead of raw Error

Screenshots / Output

search (dedup fix):

$ opencli google-scholar search "deep learning" --limit 3
- rank: 1
  title: Deep learning
  authors: Y LeCun, Y Bengio, G Hinton
  cited: '111283'
  year: '2015'
- rank: 2
  title: Deep learning
  authors: I Goodfellow, Y Bengio, A Courville, Y Bengio
  cited: '93860'
  year: '2016'
- rank: 3
  title: Deep learning
  authors: Y Bengio, I Goodfellow, A Courville
  cited: '2873'
  year: '2017'
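
The duplicate-result bug fixed in search comes from how a selector list behaves: an element matches if it satisfies any selector in the list, so an outer container and its inner child are both returned. The sketch below models this on a plain object tree rather than a real DOM. `makePaper`, `matchAll`, and the tree shape are hypothetical stand-ins; only the class names come from the PR description.

```javascript
// Simplified stand-in for the Scholar results markup: each paper is an outer
// container (.gs_r.gs_or.gs_scl) wrapping an inner .gs_ri block.
const makePaper = (title) => ({
  classes: ['gs_r', 'gs_or', 'gs_scl'],
  title,
  children: [{ classes: ['gs_ri'], title, children: [] }],
});

const root = {
  classes: [],
  children: [makePaper('Paper A'), makePaper('Paper B')],
};

// A selector list is modelled as an array of compound selectors; each compound
// selector is an array of class names that must all be present on the node.
function matchAll(node, selectorList, out = []) {
  const hit = selectorList.some((compound) =>
    compound.every((cls) => node.classes.includes(cls)));
  if (hit) out.push(node);
  for (const child of node.children) matchAll(child, selectorList, out);
  return out;
}

const oldSelector = [['gs_r', 'gs_or', 'gs_scl'], ['gs_ri']]; // ".gs_r.gs_or.gs_scl, .gs_ri"
const newSelector = [['gs_r', 'gs_or', 'gs_scl']];            // ".gs_r.gs_or.gs_scl"

console.log(matchAll(root, oldSelector).map((n) => n.title));
// [ 'Paper A', 'Paper A', 'Paper B', 'Paper B' ]
console.log(matchAll(root, newSelector).map((n) => n.title));
// [ 'Paper A', 'Paper B' ]
```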

cite:

$ opencli google-scholar cite "attention is all you need"
- title: Attention is all you need
  format: bibtex
  citation: |-
    @article{vaswani2017attention,
      title={Attention is all you need},
      author={Vaswani, Ashish and Shazeer, Noam and ...},
      journal={Advances in neural information processing systems},
      volume={30},
      year={2017}
    }
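
How --style and the 1-based --index might be validated before a result is picked can be sketched as below. `pickResult` is a hypothetical helper and the error messages are illustrative; only the four format names and the 1-based convention come from the PR.

```javascript
// Formats the PR lists for --style.
const STYLES = ['bibtex', 'endnote', 'refman', 'refworks'];

// Hypothetical helper: validate --style and convert the 1-based --index into
// the 0-based position used to pick a row from the search results.
function pickResult(results, { style = 'bibtex', index = 1 } = {}) {
  if (!STYLES.includes(style)) {
    throw new Error(`Unknown style "${style}"; expected one of ${STYLES.join(', ')}`);
  }
  if (!Number.isInteger(index) || index < 1 || index > results.length) {
    throw new Error(`--index must be between 1 and ${results.length}`);
  }
  return { style, result: results[index - 1] };
}

// Illustrative data only; real results would come from the search page.
const results = ['Attention is all you need', 'Some other paper'];
console.log(pickResult(results));
console.log(pickResult(results, { style: 'endnote', index: 2 }));
```

Rejecting an out-of-range index up front avoids clicking a cite button that does not exist on the page.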

profile:

$ opencli google-scholar profile "Geoffrey Hinton" --limit 5
- rank: 0
  title: Geoffrey Hinton (Emeritus Prof. Computer Science, University of Toronto)
  cited: h=190 i10=527 total=1031832
- rank: 1
  title: Imagenet classification with deep convolutional neural networks
  cited: '193904'
  year: '2012'
- rank: 2
  title: Deep learning
  cited: '111283'
  year: '2015'
...

Copilot AI review requested due to automatic review settings April 25, 2026 10:56
Benjamin-eecs force-pushed the feat/google-scholar-cite branch from 02db0de to 9cc37a4 on April 25, 2026 10:59

Copilot AI left a comment

Pull request overview

Adds new cite and profile commands to the google-scholar adapter and fixes duplicate search results caused by an overly-broad CSS selector.

Changes:

  • Fix search result deduplication by narrowing the result container selector.
  • Add google-scholar cite to fetch formatted citations (BibTeX/EndNote/RefMan/RefWorks) from Scholar.
  • Add google-scholar profile to fetch author profile metrics and top papers (by name lookup or direct user ID).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| clis/google-scholar/search.js | Narrows the DOM selector to avoid double-counting results. |
| clis/google-scholar/cite.js | Introduces a new command to open the cite dialog and fetch formatted citation content. |
| clis/google-scholar/profile.js | Introduces a new command to navigate to an author profile and extract stats + papers. |

Comment on lines +4 to +19
cli({
  site: 'google-scholar',
  name: 'profile',
  description: 'View a Google Scholar author profile',
  domain: 'scholar.google.com',
  strategy: Strategy.PUBLIC,
  browser: true,
  args: [
    { name: 'author', positional: true, required: true, help: 'Author name or Scholar user ID (e.g. JicYPdAAAAAJ)' },
    { name: 'limit', type: 'int', default: 10, help: 'Max papers to show (max 20)' },
  ],
  columns: ['rank', 'title', 'cited', 'year'],
  navigateBefore: false,
  func: async (page, kwargs) => {
    const author = requireNonEmptyQuery(kwargs.author, 'author');
    const limit = clampInt(kwargs.limit, 10, 1, 20);

Copilot AI Apr 25, 2026

Missing tests for the new google-scholar/profile command. Since this adapter already has vitest coverage for command registration and pre-navigation argument validation (clis/google-scholar/search.test.js), add a profile.test.js to verify the command is registered correctly and that empty/whitespace author input is rejected before calling page.goto.
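
What such a test would check, namely that blank input is rejected before any navigation, can be sketched without the repo's test harness. Everything here is a stand-in: `requireNonEmptyQuery` is re-implemented hypothetically (the real helper is not shown in this PR and, per the checklist, would throw a CliError subclass rather than a raw Error), `profileFunc` mimics only the first lines of the command body, and the fake page records `goto` calls in place of a browser.

```javascript
// Hypothetical stand-in for the repo's requireNonEmptyQuery helper.
function requireNonEmptyQuery(value, name) {
  const text = String(value ?? '').trim();
  if (text === '') throw new Error(`${name} must not be empty`);
  return text;
}

// Stand-in for the command body: validate first, navigate second.
async function profileFunc(page, kwargs) {
  const author = requireNonEmptyQuery(kwargs.author, 'author');
  await page.goto(`https://scholar.google.com/citations?user=${encodeURIComponent(author)}`);
  return author;
}

// Fake page object that records every navigation instead of opening a browser.
function makeFakePage() {
  const visited = [];
  return { visited, goto: async (url) => { visited.push(url); } };
}

// Resolves to true only if the command throws and goto was never called.
async function rejectsBeforeNavigation(author) {
  const page = makeFakePage();
  try {
    await profileFunc(page, { author });
    return false;
  } catch {
    return page.visited.length === 0;
  }
}

rejectsBeforeNavigation('   ').then((ok) => console.log('whitespace rejected:', ok));
```

In the repo itself the same assertions would presumably live in a profile.test.js alongside the existing search.test.js and use vitest rather than a hand-rolled check.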

Comment thread clis/google-scholar/cite.js Outdated
cli({
  site: 'google-scholar',
  name: 'cite',
  description: 'Get citation (BibTeX) for a Google Scholar paper',

Copilot AI Apr 25, 2026

The command description says it gets a citation "(BibTeX)", but the command supports multiple formats via --style (bibtex/endnote/refman/refworks). Update the description to reflect all supported formats so that opencli ... --help is accurate.

Suggested change
- description: 'Get citation (BibTeX) for a Google Scholar paper',
+ description: 'Get citation for a Google Scholar paper',

Comment on lines +4 to +18
cli({
  site: 'google-scholar',
  name: 'cite',
  description: 'Get citation (BibTeX) for a Google Scholar paper',
  domain: 'scholar.google.com',
  strategy: Strategy.PUBLIC,
  browser: true,
  args: [
    { name: 'query', positional: true, required: true, help: 'Paper title to search for' },
    { name: 'style', default: 'bibtex', choices: ['bibtex', 'endnote', 'refman', 'refworks'], help: 'Citation format' },
    { name: 'index', type: 'int', default: 1, help: 'Which search result to cite (1-based)' },
  ],
  columns: ['title', 'format', 'citation'],
  navigateBefore: false,
  func: async (page, kwargs) => {

Copilot AI Apr 25, 2026

Missing tests for the new google-scholar/cite command. This repo already uses vitest for CLI command registration/argument validation (e.g., clis/google-scholar/search.test.js); please add a similar test file to cover command registration and that empty/whitespace queries are rejected before any browser navigation.

Benjamin-eecs force-pushed the feat/google-scholar-cite branch 4 times, most recently from 7b8cb3a to c493d49 on April 26, 2026 03:30

- cite: get BibTeX/EndNote/RefMan/RefWorks citation for a paper.
  Clicks the cite button in search results and fetches the citation
  content from Google's citation endpoint.

- profile: view an author's Google Scholar profile (h-index,
  i10-index, citation count, top papers). Accepts author name
  or Scholar user ID.

- search: fix duplicate results caused by CSS selector matching
  both outer container (.gs_r.gs_or.gs_scl) and inner child
  (.gs_ri) for each paper.

Closes jackwener#1174, closes jackwener#1175

Benjamin-eecs force-pushed the feat/google-scholar-cite branch from c493d49 to f087019 on April 26, 2026 13:22

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

  • [Feature]: add cite and profile commands to google-scholar adapter
  • [Bug]: google-scholar search returns duplicate results

2 participants