Skip to content

feat(taps): Introduced method in Tap class to convert input catalog to Stream objects, without discovery#3324

Draft
edgarrmondragon wants to merge 2 commits into
mainfrom
feat/streams-from-catalog-2
Draft

feat(taps): Introduced method in Tap class to convert input catalog to Stream objects, without discovery#3324
edgarrmondragon wants to merge 2 commits into
mainfrom
feat/streams-from-catalog-2

Conversation

@edgarrmondragon
Copy link
Copy Markdown
Collaborator

@edgarrmondragon edgarrmondragon commented Oct 21, 2025

This clarifies whether the tap should perform discovery or use the input catalog, the two being mutually exclusive.

Supersedes #3084

Summary by Sourcery

Provide a clear separation between discovery and catalog-driven stream loading by adding load_streams_from_catalog and updating streams property, and enhance Catalog with a has_stream_selected helper

New Features:

  • Introduce load_streams_from_catalog method in Tap to recreate streams from a provided catalog without discovery
  • Update Tap.streams property to choose between discovery and input catalog processing
  • Add has_stream_selected method to Catalog for checking if a stream is selected

Enhancements:

  • Add Iterable import under TYPE_CHECKING in Tap and template for consistent typing

@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai Bot commented Oct 21, 2025

Reviewer's Guide

This PR refactors stream loading in Tap to distinguish between discovery and catalog-driven workflows by introducing a new load_streams_from_catalog method and branching logic in streams(), adds a helper to check stream selection in Catalog, and updates TYPE_CHECKING imports for Iterable.

Sequence diagram for Tap.streams() branching logic

sequenceDiagram
    participant Tap
    participant Catalog
    participant Stream
    Tap->>Tap: streams()
    alt input_catalog is None
        Tap->>Tap: load_streams()
        Tap->>Stream: create Stream objects
        Tap->>Tap: store streams by name
    else input_catalog is provided
        Tap->>Tap: load_streams_from_catalog(input_catalog)
        Tap->>Stream: create Stream objects from catalog
        Tap->>Stream: apply_catalog(input_catalog)
        Tap->>Tap: store streams by name
    end
Loading

Class diagram for updated Tap class and Catalog

classDiagram
    class Tap {
        +streams: dict[str, Stream]
        +load_streams() list[Stream]
        +load_streams_from_catalog(catalog: Catalog) Iterable[Stream]
    }
    class Catalog {
        +get_stream(stream_id: str) CatalogEntry | None
        +has_stream_selected(stream_id: str) bool
    }
    Tap --> Catalog : uses
    Catalog --> CatalogEntry : returns
Loading

Class diagram for new has_stream_selected method in Catalog

classDiagram
    class Catalog {
        +has_stream_selected(stream_id: str) bool
    }
    class CatalogEntry {
        +metadata
    }
    Catalog --> CatalogEntry : get_stream(stream_id)
    CatalogEntry --> Metadata : metadata
    class Metadata {
        +resolve_selection() mask
    }
Loading

File-Level Changes

Change Details Files
Conditional stream loading based on presence of input_catalog
  • Replaced unconditional load_streams() loop with conditional logic
  • When no catalog, build streams via load_streams()
  • When catalog provided, use load_streams_from_catalog and apply_catalog on each stream
singer_sdk/tap_base.py
Added load_streams_from_catalog method
  • Defined method signature returning Iterable[Stream]
  • Provided docstring explaining backward compatibility
  • Delegates to existing load_streams()
singer_sdk/tap_base.py
Added has_stream_selected utility in Catalog
  • Introduced has_stream_selected() returning metadata selection mask
  • Retrieves stream via get_stream and resolves its selection
singer_sdk/singerlib/catalog.py
Updated TYPE_CHECKING imports for Iterable
  • Imported Iterable from collections.abc in tap_base.py
  • Mirrored the import in the cookiecutter template
singer_sdk/tap_base.py
cookiecutter/tap-template/{{cookiecutter.tap_id}}/{{cookiecutter.library_name}}/non-sql-tap.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@edgarrmondragon edgarrmondragon changed the title feat(tap): Introduced method in Tap class to convert input catalog to Stream objects, without discovery feat(taps): Introduced method in Tap class to convert input catalog to Stream objects, without discovery Oct 21, 2025
Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `singer_sdk/tap_base.py:414-425` </location>
<code_context>
             reverse=False,
         )

+    def load_streams_from_catalog(self, catalog: Catalog) -> Iterable[Stream]:  # noqa: ARG002
+        """Recreate streams from the input catalog.
+
+        Returns:
+            Iterable[Stream]: An iterable of Stream objects recreated from the input
+                catalog.
+        For backwards compatibility calls `self.load_streams()`
+
+        Returns:
+            A mapping of names to streams, using provided catalog.
+        """
+        return self.load_streams()
+
     # Bookmarks and state management
</code_context>

<issue_to_address>
**suggestion:** The new method load_streams_from_catalog currently ignores the catalog argument.

Consider updating the implementation to use the catalog argument, or clarify in the docstring that it is currently unused.

```suggestion
    def load_streams_from_catalog(self, catalog: Catalog) -> Iterable[Stream]:  # noqa: ARG002
        """Recreate streams from the input catalog.

        Note:
            The `catalog` argument is currently unused. For backwards compatibility,
            this method calls `self.load_streams()`.

        Returns:
            Iterable[Stream]: An iterable of Stream objects recreated from the input
                catalog.
        """
        return self.load_streams()
```
</issue_to_address>

### Comment 2
<location> `singer_sdk/tap_base.py:168-172` </location>
<code_context>
-
-            for stream in self.load_streams():
-                if input_catalog is not None:
+            if input_catalog is None:
+                self._streams = {stream.name: stream for stream in self.load_streams()}
+            else:
+                self._streams = {}
+                for stream in self.load_streams_from_catalog(input_catalog):
                     stream.apply_catalog(input_catalog)
-                self._streams[stream.name] = stream
</code_context>

<issue_to_address>
**suggestion:** The logic for stream loading now depends on input_catalog, but load_streams_from_catalog currently delegates to load_streams.

If catalog-based filtering is intended, implement it in load_streams_from_catalog or clarify the expected override behavior in documentation.
</issue_to_address>

### Comment 3
<location> `singer_sdk/singerlib/catalog.py:436-445` </location>
<code_context>
         """
         return self.get(stream_id)
+
+    def has_stream_selected(self, stream_id: str) -> bool:
+        """Returns true if the stream is selected.
+
+        Args:
+            stream_id: The ID/name of the stream.
+        """
+        if stream := self.get_stream(stream_id):
+            mask = stream.metadata.resolve_selection()
+            return mask[()]
+
+        return False
</code_context>

<issue_to_address>
**issue (bug_risk):** has_stream_selected may raise if resolve_selection returns a non-dict or mask is missing the expected key.

Add checks to confirm resolve_selection returns a dict and mask contains the expected key to prevent runtime errors from malformed input.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread singer_sdk/tap_base.py
Comment thread singer_sdk/tap_base.py
Comment on lines +436 to +445
def has_stream_selected(self, stream_id: str) -> bool:
"""Returns true if the stream is selected.

Args:
stream_id: The ID/name of the stream.
"""
if stream := self.get_stream(stream_id):
mask = stream.metadata.resolve_selection()
return mask[()]

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

issue (bug_risk): has_stream_selected may raise if resolve_selection returns a non-dict or mask is missing the expected key.

Add checks to confirm resolve_selection returns a dict and mask contains the expected key to prevent runtime errors from malformed input.

@read-the-docs-community
Copy link
Copy Markdown

read-the-docs-community Bot commented Oct 21, 2025

Documentation build overview

📚 Meltano SDK | 🛠️ Build #32409748 | 📁 Comparing 373758d against latest (bad876d)

  🔍 Preview build  

3 files changed
± genindex.html
± classes/singer_sdk.Tap.html
± classes/singer_sdk.sql.SQLTap.html

@edgarrmondragon edgarrmondragon force-pushed the feat/streams-from-catalog-2 branch from 2f61b57 to 923687d Compare October 21, 2025 22:21
@edgarrmondragon edgarrmondragon marked this pull request as draft October 21, 2025 22:21
@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Oct 21, 2025

Merging this PR will not alter performance

✅ 8 untouched benchmarks


Comparing feat/streams-from-catalog-2 (373758d) with main (bad876d)

Open in CodSpeed

@codecov
Copy link
Copy Markdown

codecov Bot commented Oct 21, 2025

Codecov Report

❌ Patch coverage is 66.66667% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.79%. Comparing base (bad876d) to head (373758d).

Files with missing lines Patch % Lines
singer_sdk/singerlib/catalog.py 20.00% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3324      +/-   ##
==========================================
- Coverage   93.85%   93.79%   -0.06%     
==========================================
  Files          73       73              
  Lines        5907     5915       +8     
  Branches      725      726       +1     
==========================================
+ Hits         5544     5548       +4     
- Misses        270      274       +4     
  Partials       93       93              
Flag Coverage Δ
core 82.23% <66.66%> (-0.05%) ⬇️
end-to-end 75.40% <66.66%> (-0.04%) ⬇️
optional-components 42.72% <16.66%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@edgarrmondragon edgarrmondragon force-pushed the feat/streams-from-catalog-2 branch from ead4d7b to 3f07ba1 Compare October 24, 2025 16:18
@edgarrmondragon edgarrmondragon force-pushed the feat/streams-from-catalog-2 branch from 3f07ba1 to 4d6bcf6 Compare November 6, 2025 21:02
@edgarrmondragon edgarrmondragon added the Type/Tap Singer taps label Nov 6, 2025
@edgarrmondragon edgarrmondragon force-pushed the feat/streams-from-catalog-2 branch from 4d6bcf6 to be2d51f Compare November 7, 2025 22:05
Rafał Krupiński and others added 2 commits April 24, 2026 13:05
…to `Stream` objects, without discovery

This clarifies whether the tap should perform discovery or use the input catalog, the two being mutually exclusive.

Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
Co-authored-by: Rafal Krupinski <low.map7277@fastmail.com>
Signed-off-by: Edgar Ramírez Mondragón <edgarrm358@gmail.com>

Co-authored-by: Edgar Ramírez Mondragón <edgarrm358@gmail.com>
Signed-off-by: Edgar Ramírez Mondragón <edgarrm358@gmail.com>
Co-authored-by: sourcery-ai[bot] <58596630+sourcery-ai[bot]@users.noreply.github.com>
@edgarrmondragon edgarrmondragon force-pushed the feat/streams-from-catalog-2 branch from c471a51 to 373758d Compare April 24, 2026 19:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Type/Tap Singer taps

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant