SOLR-17788: Abort replication if max retries is reached to fetch a file #3389

Open
wants to merge 1 commit into
base: main


@nilesh892003 commented Jun 16, 2025

https://issues.apache.org/jira/browse/SOLR-17788

Description

This PR introduces an early abort in the replication process when the maximum retry limit for fetching a file is reached.

The issue is tricky to reproduce because it depends on the following sequence:

  1. fetchLatestIndex is called and IndexFetcher populates the list of files to download (filesToDownload).
  2. Before it can download all the files, the leader changes its index and cleans up the index files.
  3. fetchFile().Fetch() keeps retrying in an infinite loop to get hold of all the files, and replication hangs.

Solution

  • Inserts a SolrException catch block that logs the abort and sets an aborted flag.
  • Rethrows the exception to stop further retries.
  • Performs fsync in the finally block only if the index has been fully downloaded.
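
The control flow described in the bullets above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual patch or Solr's API: `ReplicationAbortSketch`, `FileSource`, `FetchAbortedException`, `MAX_RETRIES`, and the example file names are all hypothetical stand-ins, and the `synced` flag merely marks the point where an fsync would happen.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from Solr's IndexFetcher.
class ReplicationAbortSketch {
  static final int MAX_RETRIES = 5; // assumed limit, for illustration only

  /** Stand-in for one attempt to fetch a file from the leader. */
  interface FileSource {
    boolean tryFetch(String fileName); // true when the file was downloaded
  }

  /** Stand-in for the exception raised once retries are exhausted. */
  static class FetchAbortedException extends RuntimeException {
    FetchAbortedException(String message) {
      super(message);
    }
  }

  boolean aborted = false;
  boolean synced = false; // marks where an fsync would be performed
  final List<String> downloaded = new ArrayList<>();

  /** Bounded retry: gives up instead of looping forever. */
  void fetchWithRetries(FileSource source, String fileName) {
    for (int attempt = 1; attempt <= MAX_RETRIES; attempt++) {
      if (source.tryFetch(fileName)) {
        return;
      }
    }
    throw new FetchAbortedException("Max retries reached for " + fileName);
  }

  void replicate(FileSource source, List<String> filesToDownload) {
    try {
      for (String fileName : filesToDownload) {
        fetchWithRetries(source, fileName);
        downloaded.add(fileName);
      }
    } catch (FetchAbortedException e) {
      // Log the abort, record it, and rethrow so no further files are tried.
      aborted = true;
      System.err.println("Aborting replication: " + e.getMessage());
      throw e;
    } finally {
      // Sync only when every file arrived and nothing was aborted.
      if (!aborted && downloaded.size() == filesToDownload.size()) {
        synced = true;
      }
    }
  }

  public static void main(String[] args) {
    ReplicationAbortSketch run = new ReplicationAbortSketch();
    try {
      // Simulate a leader that has already removed one of the files.
      run.replicate(name -> !name.equals("segments_2"), List.of("_0.cfs", "segments_2"));
    } catch (FetchAbortedException e) {
      System.out.println("aborted=" + run.aborted + ", synced=" + run.synced);
    }
  }
}
```

Because the catch block runs before the finally block, the aborted flag is already set by the time the sync decision is made, so a partially downloaded index is never synced.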

Tests

Apache Solr lacks JUnit tests for the entire IndexFetcher. Hence I made this change in the custom version of Solr that we run in production to verify the fix.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended, not available for branches on forks living under an organisation)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide
