Correctly handling download_database_on_pipeline_creation within a pipeline processor within a default or final pipeline #131236

Open
wants to merge 13 commits into
base: main

masseyke
Member

We are supposed to load a geoip database even if download_database_on_pipeline_creation is set to false, as long as the geoip processor is referenced from a default or final pipeline. However, if the geoip processor lives in a pipeline that the default or final pipeline reaches only through a pipeline processor, we do not handle this correctly. As a result, the database is not downloaded, and all data is tagged with something like _geoip_database_unavailable_GeoLite2-City.mmdb rather than having geo data added to it.
Related: #96624
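
To make the failure mode concrete, here is a minimal sketch under assumed, simplified types. `Processor`, `directlyContainsGeoIp`, and `nestedExample` are hypothetical names for illustration, not the actual Elasticsearch code. It shows how a check that inspects only a pipeline's own processors would miss a geoip processor that is reachable through a pipeline processor:

```java
import java.util.List;
import java.util.Map;

final class ShallowGeoIpCheck {
    /** Hypothetical, simplified processor: a type plus, for "pipeline" processors, the pipeline it calls. */
    static final class Processor {
        final String type;
        final String targetPipeline; // null unless type is "pipeline"

        Processor(String type, String targetPipeline) {
            this.type = type;
            this.targetPipeline = targetPipeline;
        }
    }

    /** Looks only at the processors listed directly in the given pipeline. */
    static boolean directlyContainsGeoIp(String pipelineId, Map<String, List<Processor>> pipelines) {
        for (Processor p : pipelines.getOrDefault(pipelineId, List.of())) {
            if ("geoip".equals(p.type)) {
                return true;
            }
        }
        return false;
    }

    /** Example layout: the default pipeline "outer" delegates to "inner", which holds the geoip processor. */
    static Map<String, List<Processor>> nestedExample() {
        return Map.of(
            "outer", List.of(new Processor("pipeline", "inner")),
            "inner", List.of(new Processor("geoip", null))
        );
    }

    public static void main(String[] args) {
        Map<String, List<Processor>> pipelines = nestedExample();
        System.out.println(directlyContainsGeoIp("inner", pipelines)); // true
        System.out.println(directlyContainsGeoIp("outer", pipelines)); // false: the nested geoip processor is missed
    }
}
```

In this model, treating "outer" as the default pipeline yields `false`, which corresponds to the database never being requested.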

…peline processor within a default or final pipeline
@masseyke added the >bug, :Data Management/Ingest Node (Execution or management of Ingest Pipelines including GeoIP), auto-backport (Automatically create backport pull requests when merged), v8.19.0, v9.1.0, v9.2.0, v9.0.5, and v8.18.5 labels on Jul 14, 2025
@masseyke requested a review from joegallo on July 14, 2025 19:18
@elasticsearchmachine added the Team:Data Management (Meta label for data/management team) label on Jul 14, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine
Collaborator

Hi @masseyke, I've created a changelog YAML for you.

Copilot

This comment was marked as outdated.

Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR fixes a bug where GeoIP databases were not being downloaded when download_database_on_pipeline_creation is set to false but the GeoIP processor is referenced through nested pipeline processors within default or final pipelines. The fix ensures proper recursive traversal of pipeline processors to detect GeoIP processors at any nesting level.

Key changes:

  • Enhanced GeoIP processor detection to recursively traverse pipeline processors
  • Added cycle detection to prevent stack overflow in pipelines with circular references
  • Added test coverage for nested pipeline scenarios
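
The recursive traversal with cycle detection described in the key changes can be sketched roughly as follows. This is an illustration under assumed, simplified types, not the code in GeoIpDownloaderTaskExecutor.java: `Processor` and `reachesGeoIp` are hypothetical names, and pipelines are modeled as a plain map from pipeline id to processor list.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class NestedGeoIpCheck {
    /** Hypothetical, simplified processor: a type plus, for "pipeline" processors, the pipeline it calls. */
    static final class Processor {
        final String type;
        final String targetPipeline; // null unless type is "pipeline"

        Processor(String type, String targetPipeline) {
            this.type = type;
            this.targetPipeline = targetPipeline;
        }
    }

    /** Returns true if a geoip processor is reachable from the given pipeline at any nesting level. */
    static boolean reachesGeoIp(String pipelineId, Map<String, List<Processor>> pipelines) {
        return reachesGeoIp(pipelineId, pipelines, new HashSet<>());
    }

    private static boolean reachesGeoIp(String pipelineId, Map<String, List<Processor>> pipelines, Set<String> visited) {
        // A pipeline that was already visited is not explored again, so circular references terminate.
        if (!visited.add(pipelineId)) {
            return false;
        }
        for (Processor p : pipelines.getOrDefault(pipelineId, List.of())) {
            if ("geoip".equals(p.type)) {
                return true;
            }
            // Follow pipeline processors into the pipeline they reference.
            if ("pipeline".equals(p.type) && reachesGeoIp(p.targetPipeline, pipelines, visited)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<Processor>> nested = Map.of(
            "outer", List.of(new Processor("pipeline", "inner")),
            "inner", List.of(new Processor("geoip", null))
        );
        Map<String, List<Processor>> cyclic = Map.of(
            "a", List.of(new Processor("pipeline", "b")),
            "b", List.of(new Processor("pipeline", "a"))
        );
        System.out.println(reachesGeoIp("outer", nested)); // true
        System.out.println(reachesGeoIp("a", cyclic));     // false, and the call terminates
    }
}
```

A shared visited set is enough here because the question is only whether a geoip processor is reachable; a pipeline that has already been explored cannot contribute anything new on a second visit.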

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| GeoIpDownloaderTaskExecutor.java | Core logic enhancement to recursively detect GeoIP processors in nested pipeline processors with cycle detection |
| GeoIpDownloaderTaskExecutorTests.java | New test cases covering nested pipeline scenarios and recursive pipeline detection |
| docs/changelog/131236.yaml | Changelog entry documenting the bug fix |

2 participants