Optimize distribute pipeline #1198

Open
JulienPeloton wants to merge 3 commits into master from issue/1164/optimize-distribute-pipeline
@JulienPeloton

Member

IMPORTANT: Please create an issue first before opening a Pull Request.
Linked to issue(s): Closes #1164

What changes were proposed in this pull request?

Merge conflicts from #1167 and #1186

How was this patch tested?

CI

machterMassi06 and others added 2 commits June 3, 2026 07:26
* implementation with 1 writeStream process (spark-kafka) instead of N before

* [rubin] 1 writeStream process (spark-kafka) instead of N before, keep N HBase writeStreams

* Update writeStream configuration to derive topic from column instead of static options

* PEP8/Ruff

* Fix Kafka distribution issue with npart and validate streaming behavior

* set npart to 10

* set npart=10 in the same way as rubin/distribute

---------

Co-authored-by: Julien <peloton@lal.in2p3.fr>
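
The change described in the commit messages above (one Kafka writeStream whose topic is read from a column, instead of N streams each with a static topic option) can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function names, the `topic_col` parameter, and the JSON serialization of the payload are all hypothetical. It relies on the documented behaviour of Spark's Kafka sink, which routes each row to the topic named in its `topic` column when no `topic` option is set.

```python
from typing import Dict


def kafka_sink_options(bootstrap_servers: str, checkpoint_dir: str) -> Dict[str, str]:
    """Options for one shared Kafka sink (hypothetical helper).

    No static "topic" option is set on purpose, so Spark takes the
    destination from the "topic" column of each row.
    """
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "checkpointLocation": checkpoint_dir,
    }


def start_single_kafka_stream(df, topic_col: str, options: Dict[str, str]):
    """Start one streaming query that serves every topic found in `topic_col`."""
    # Imported lazily so kafka_sink_options stays usable without Spark installed.
    from pyspark.sql import functions as F

    # Everything except the routing column becomes the message payload.
    # JSON is an assumption here; the real pipeline may serialize differently.
    payload_cols = [F.col(c) for c in df.columns if c != topic_col]
    out = df.select(
        F.col(topic_col).alias("topic"),
        F.to_json(F.struct(*payload_cols)).alias("value"),
    )
    return out.writeStream.format("kafka").options(**options).start()
```

With this shape a single streaming query and a single checkpoint directory replace the N per-topic queries, which is presumably where the CPU and RAM savings targeted by #1164 would come from.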
@sonarqubecloud

sonarqubecloud Bot commented Jun 3, 2026


Quality Gate failed

Failed conditions
7.9% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

Development

Successfully merging this pull request may close these issues.

[rubin] [ztf] [optimization] Rewriting the distribute (spark-kafka) pipeline to improve CPU and RAM efficiency

2 participants