Conversation

Contributor

@Raptors65 Raptors65 commented Nov 5, 2025

Pull Request Checklist

Reference Issue

Please provide the reference to issue this PR is addressing (# followed by the issue number). If there is no associated issue, write "N/A".

ref: N/A

Adds SweRank support

Still a work in progress (a lot of this code is messy, Cursor-generated code); I will put up a non-draft PR once that's cleaned up.

Checklist Items

Before submitting your pull request, please review these items:

  • Have you followed the contributing guidelines?
  • Have you verified that there are no existing Pull Requests for the same update/change?
  • Have you updated any relevant documentation or added new tests where needed?

PR Type

What kind of change does this PR introduce?

  • Bugfix
  • Feature
  • Code style update (formatting, local variables)
  • Refactoring (no functional changes, no API changes)
  • Documentation content changes
  • Reproduction logs
  • Other...
    • Description:

@Raptors65 force-pushed the add-swerank-support branch from dd9239f to 08ff53f on November 5, 2025 13:33

# Set max document length based on rerank type
if self._rerank_type == "code":
    max_doc_length = 1024  # Longer for code
Member

Where do these numbers come from?

We normally start with the user-provided context size / window size.

Contributor Author

This is from the original SweRank code, but yes, I do think it makes more sense to just use the user-provided size here.
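
As a rough illustration of the point in this thread, a per-document length cap can be derived from the user-provided context and window sizes rather than hardcoded per rerank type. The sketch below is hypothetical: `per_doc_token_budget` and `reserved_tokens` are illustrative names and are not taken from rank_llm or the SweRank code.

```python
def per_doc_token_budget(
    context_size: int, window_size: int, reserved_tokens: int = 512
) -> int:
    """Split the usable context evenly across the documents in one window.

    Hypothetical helper: `reserved_tokens` is an assumed allowance for the
    instructions, the query, and the generated ranking.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    usable = context_size - reserved_tokens
    if usable <= 0:
        raise ValueError("context_size is too small for the reserved tokens")
    # Integer division keeps the total within the usable budget.
    return usable // window_size


budget = per_doc_token_budget(context_size=4096, window_size=20)
```

With this shape, a larger context size or a smaller window automatically gives each document more room, so no separate "code" versus "text" constant is needed.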

Member

Why is this new file needed? For the most part it looks like a copy-paste of https://github.com/castorini/rank_llm/blob/main/src/rank_llm/rerank/listwise/rank_listwise_os_llm.py.
Please either modify that file to handle code reranking (preferred) or, if you have to create a new class, inherit from RankListwiseOSLLM to avoid some of the copy-pasting.

Contributor Author

I agree, working on that right now. The PR is still a work in progress; I'll mark it as ready for review once the code is working and in a reasonable state haha
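
The inheritance option suggested in this thread could look roughly like the sketch below. It uses a stand-in base class so that it runs on its own; `ListwiseReranker`, `CodeListwiseReranker`, `format_candidate`, and `build_prompt` are all hypothetical names and do not reflect the actual interface of RankListwiseOSLLM.

```python
class ListwiseReranker:
    """Stand-in for an existing listwise reranker (NOT rank_llm's real class)."""

    def format_candidate(self, rank: int, text: str) -> str:
        # Default behaviour: one candidate per line, whitespace trimmed.
        return f"[{rank}] {text.strip()}"

    def build_prompt(self, query: str, candidates: list[str]) -> str:
        lines = [f"Query: {query}"]
        lines += [
            self.format_candidate(rank, text)
            for rank, text in enumerate(candidates, start=1)
        ]
        return "\n".join(lines)


class CodeListwiseReranker(ListwiseReranker):
    """Overrides only the step that differs for code; the rest is inherited."""

    def format_candidate(self, rank: int, text: str) -> str:
        # Keep leading indentation, which is meaningful in code snippets.
        return f"[{rank}]\n{text.rstrip()}"


prompt = CodeListwiseReranker().build_prompt("crash on load", ["  def f():\n    pass\n"])
```

The subclass carries only the code-specific difference, so fixes to the shared prompt-building logic do not have to be duplicated.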

Member

What's the difference between this and rank_zephyr_template? Why introduce a new one?

Contributor Author

Ah, I didn't notice those were the same. Will remove!

    choices=["text", "code"],
    help="Type of reranking: 'text' for standard passages, 'code' for GitHub issues (SweRank models only)",
)
parser.add_argument(
Member

Why is this needed? We are deprecating the prompt types; using prompt templates should be enough.

Contributor Author

agreed
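
A minimal sketch of what dropping the rerank-type flag could look like: the caller selects behaviour only through a template path. The flag names `--prompt_template_path` and `--context_size` are assumptions modelled on the keys shown in this PR, not a confirmed copy of the rank_llm CLI; the template path is the one mentioned elsewhere in this conversation.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI: no rerank-type switch, only a template path."""
    parser = argparse.ArgumentParser(description="Listwise reranking demo (sketch)")
    parser.add_argument(
        "--prompt_template_path",
        type=str,
        required=True,
        help="Path to the YAML prompt template that defines the prompt format",
    )
    parser.add_argument(
        "--context_size",
        type=int,
        default=4096,
        help="Maximum number of tokens in the model context",
    )
    return parser


# Parse an explicit argument list so the sketch does not depend on sys.argv.
args = build_parser().parse_args(
    [
        "--prompt_template_path",
        "src/rank_llm/rerank/prompt_templates/swerank_github_issue_template.yaml",
    ]
)
```

Switching between text and code reranking then means passing a different template file, with no extra branch in the argument handling.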

    min_doc_length = 300
else:
    max_doc_length = 300  # Standard for text
    min_doc_length = 100
Member

Why does min doc length matter?


keys_and_defaults = [
    ("context_size", 4096),
    ("prompt_template_path", None),  # Will be set based on rerank_type
Member

This is the other way around: you need to set prompt_template_path, and rerank_type should not be needed.

Member

sahel-sh commented Nov 5, 2025

@Raptors65 thank you for working on this.

My understanding is that you only need src/rank_llm/rerank/prompt_templates/swerank_github_issue_template.yaml to support GitHub issue/code ranking, and optionally a demo file. Nothing else should be needed; in particular, the min/max doc length should not be set based on the ranking type instead of the user-provided window and context sizes.

Please take a look at #313 as an example. Happy to chat more if things are not clear or if you think I am missing some context.

@Raptors65
Contributor Author

@sahel-sh thanks for the example PR, that's super helpful! I agree that most of the code here right now isn't necessary, still definitely a WIP; I'll get this cleaned up by end of day. Thanks for the comments!

@Raptors65 force-pushed the add-swerank-support branch from 08ff53f to 3d57870 on November 5, 2025 17:07

2 participants