
feat(taps): Allow REST streams to ignore certain errors and continue on to the next stream/context#3517

Open
edgarrmondragon wants to merge 23 commits into main from feat/safely-ignore-errors

@edgarrmondragon
Collaborator

@edgarrmondragon edgarrmondragon commented Feb 26, 2026

SSIA

Summary by Sourcery

Add support for treating certain REST stream errors as ignorable so affected requests are skipped without failing the sync.

Enhancements:

  • Log a warning and skip further processing when an ignorable error occurs during REST stream record requests.

Tests:

  • Add tests covering propagation of IgnorableAPIError from REST response validation and verifying that REST streams skip records and log appropriately on ignorable errors.

Signed-off-by: Edgar Ramírez Mondragón <edgarrm358@gmail.com>
…on to the next stream/context

Signed-off-by: Edgar Ramírez Mondragón <edgarrm358@gmail.com>
@edgarrmondragon edgarrmondragon self-assigned this Feb 26, 2026
@sourcery-ai
Contributor

sourcery-ai Bot commented Feb 26, 2026

Reviewer's Guide

Adds support for ignorable REST stream errors that skip the current request/stream while allowing the overall sync to continue, and introduces tests to validate this behavior and its logging.

Sequence diagram for REST stream request handling with ignorable errors

sequenceDiagram
    actor TapRunner
    participant RESTStream
    participant Paginator
    participant HTTPClient as decorated_request
    participant Logger

    TapRunner->>RESTStream: request_records(context)
    loop pages for context
        RESTStream->>Paginator: get_next(prepared_request, context)
        Paginator-->>RESTStream: next_page_token
        RESTStream->>HTTPClient: decorated_request(prepared_request, context)
        HTTPClient-->>RESTStream: IgnorableSyncError
        RESTStream->>Logger: warning(Skipping request due to ignorable error)
        RESTStream-->>TapRunner: break page loop for this context
    end
    TapRunner->>TapRunner: continue with next stream/context
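
The flow in the diagram can be sketched in a few lines of Python. This is a minimal, self-contained illustration and not the SDK's code: `IgnorableSyncError` is a local stand-in for the exception named in the diagram, and `request_records`, `fake_send`, and the integer page tokens are hypothetical simplifications of the paginator and `decorated_request`.

```python
import logging
from collections.abc import Callable, Iterator

logger = logging.getLogger("rest_stream_sketch")


class IgnorableSyncError(Exception):
    """Local stand-in for the exception named in the diagram."""


def request_records(
    pages: list[int],
    send: Callable[[int], list[dict]],
) -> Iterator[dict]:
    """Yield records page by page, stopping at the first ignorable error."""
    for page in pages:
        try:
            records = send(page)
        except IgnorableSyncError as e:
            # Log and leave the page loop; the caller moves on to the
            # next stream/context instead of failing the whole sync.
            logger.warning("Skipping request due to ignorable error: %s", e)
            break
        yield from records


def fake_send(page: int) -> list[dict]:
    """Hypothetical transport: page 2 fails with an ignorable error."""
    if page == 2:
        raise IgnorableSyncError("404 Not Found")
    return [{"page": page}]
```

With pages `[1, 2, 3]`, only the record from page 1 is yielded, and page 3 is never requested because the loop breaks on page 2.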

Class diagram for REST stream error handling extensions

classDiagram
    class RESTStream {
        logger
        extra_retry_statuses
        validate_response(response)
        request_records(context)
        parse_response(response)
        update_sync_costs(prepared_request, response, context)
    }

    class IgnorableAPIError {
    }

    class IgnorableSyncError {
    }

    RESTStream ..> IgnorableAPIError : may raise
    RESTStream ..> IgnorableSyncError : caught in

    note for RESTStream "validate_response now documents IgnorableAPIError; request_records catches IgnorableSyncError to skip current request/stream and continue sync"

File-Level Changes

Change: Add an IgnorableAPIError-based stream and tests to verify both propagation from validate_response and silent skipping in request_records with appropriate logging.
Details:
  • Extend REST failure tests to import IgnorableAPIError and typing for requests_mock.
  • Define IgnorableStream that raises IgnorableAPIError on 404 responses in validate_response.
  • Add a unit test ensuring IgnorableAPIError raised in validate_response is not swallowed.
  • Add an integration-style test that mocks a 404 response, asserts request_records yields no records, and verifies a warning log about skipping the request due to an ignorable error.
Files: tests/core/rest/test_failure.py

Change: Modify RESTStream to recognize ignorable errors during requests, document IgnorableAPIError in the validate_response contract, and skip processing when such errors occur.
Details:
  • Update validate_response docstring to document IgnorableAPIError as a possible outcome when a request should be silently skipped.
  • Wrap decorated_request call in request_records with a try/except for IgnorableSyncError to catch ignorable errors.
  • On catching an ignorable error, log a warning indicating the request is being skipped and break out of the loop so no records are yielded for that request.
Files: singer_sdk/streams/rest.py
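
As a rough illustration of the `validate_response` contract described above, the sketch below classifies a response by status code. It does not import the SDK: `FatalAPIError` and `IgnorableAPIError` are local stand-ins for the exceptions named in the guide, and `FakeResponse` and the status-code rules are assumptions made for the example.

```python
from dataclasses import dataclass
from http import HTTPStatus


class FatalAPIError(Exception):
    """Local stand-in: the request failed and must not be retried."""


class IgnorableAPIError(Exception):
    """Local stand-in: the request failed but the sync may continue."""


@dataclass
class FakeResponse:
    """Hypothetical minimal response object for the example."""

    status_code: int
    reason: str


def validate_response(response: FakeResponse) -> None:
    """Raise according to the status code; return None when the response is fine."""
    if response.status_code == HTTPStatus.NOT_FOUND:
        # Assumption for this sketch: a missing resource is safe to skip.
        raise IgnorableAPIError(f"{response.status_code} {response.reason}")
    if response.status_code >= HTTPStatus.BAD_REQUEST:
        raise FatalAPIError(f"{response.status_code} {response.reason}")
```

Which status codes count as ignorable is a per-tap decision; 404 is used here only because the tests described above use it.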


Contributor

@sourcery-ai sourcery-ai Bot left a comment


Hey - I've found 3 issues, and left some high level feedback:

  • In request_records, you catch IgnorableSyncError but the rest of the change (including tests) uses IgnorableAPIError; ensure the exception type is consistent and properly imported so ignorable errors are actually handled as intended.
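
One way the two names could be consistent is if `IgnorableAPIError` derives from `IgnorableSyncError`, in which case an `except IgnorableSyncError` clause also catches the API-level error. The page does not confirm that relationship, so the hierarchy below is an assumption and both classes are local stand-ins.

```python
class IgnorableSyncError(Exception):
    """Local stand-in for the broader 'safe to skip' error."""


class IgnorableAPIError(IgnorableSyncError):
    """Assumed subclass: an ignorable error raised from response validation."""


def is_skipped(exc: Exception) -> bool:
    """Return True if an `except IgnorableSyncError` clause would catch `exc`."""
    try:
        raise exc
    except IgnorableSyncError:
        return True
    except Exception:
        return False
```

If the two classes were unrelated instead, the handler in `request_records` would not catch what `validate_response` raises, which is the risk the comment points at.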
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `request_records`, you catch `IgnorableSyncError` but the rest of the change (including tests) uses `IgnorableAPIError`; ensure the exception type is consistent and properly imported so ignorable errors are actually handled as intended.

## Individual Comments

### Comment 1
<location path="singer_sdk/streams/rest.py" line_range="236-240" />
<code_context>
             FatalAPIError: If the request is not retriable.
             RetriableAPIError: If the request is retriable.
-        """
+            IgnorableAPIError: If the request should be silently skipped.
+        """  # noqa: DOC502
         if (
             response.status_code in self.extra_retry_statuses
</code_context>
<issue_to_address>
**suggestion:** Clarify whether "silently skipped" aligns with the new warning log on ignorable errors.

The docstring says `IgnorableAPIError` should be "silently skipped", but `request_records` now logs a warning for `IgnorableSyncError`. If these represent the same class of error, consider either lowering the log level (info/debug) or rephrasing the docstring so it no longer implies the skip is silent to operators.

```suggestion
        Raises:
            FatalAPIError: If the request is not retriable.
            RetriableAPIError: If the request is retriable.
            IgnorableSyncError: If the request is skipped without raising an error
                to the caller (the skip is still logged).
        """  # noqa: DOC502
```
</issue_to_address>

### Comment 2
<location path="singer_sdk/streams/rest.py" line_range="479-485" />
<code_context>
                     next_page_token=paginator.current_value,
                 )
-                resp = decorated_request(prepared_request, context)
+                try:
+                    resp = decorated_request(prepared_request, context)
+                except IgnorableSyncError as e:
+                    self.logger.warning(
+                        "Skipping request due to ignorable error: %s", e
+                    )
+                    break
                 request_counter.increment()
                 self.update_sync_costs(prepared_request, resp, context)
</code_context>
<issue_to_address>
**question (bug_risk):** Re-evaluate using `break` on ignorable errors, which stops pagination for the remaining pages.

This handling stops pagination for the entire stream as soon as an `IgnorableSyncError` occurs. Please confirm whether the intent is to abort the rest of the loop, or whether we should instead skip just this request/page and continue with the next page/token.
</issue_to_address>
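
The two behaviours this comment contrasts can be put side by side in a small sketch. Everything here is hypothetical: `IgnorableSyncError` is a local stand-in, and `fetch`, `send`, and the `stop_on_ignorable` flag exist only to make the difference visible.

```python
from collections.abc import Callable


class IgnorableSyncError(Exception):
    """Local stand-in for the SDK exception."""


def fetch(
    pages: list[int],
    send: Callable[[int], list[int]],
    *,
    stop_on_ignorable: bool,
) -> list[int]:
    """Collect records, either stopping or skipping on an ignorable error."""
    records: list[int] = []
    for page in pages:
        try:
            records.extend(send(page))
        except IgnorableSyncError:
            if stop_on_ignorable:
                break  # behaviour in the PR: abandon the remaining pages
            continue  # alternative: skip only the failed page
    return records


def send(page: int) -> list[int]:
    """Hypothetical transport: page 2 always fails with an ignorable error."""
    if page == 2:
        raise IgnorableSyncError("404 Not Found")
    return [page]
```

Skipping only the failed page assumes the next page can be addressed without the failed response. When the next-page token is derived from the previous response, there may be nothing to continue from, which would favour `break`.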

### Comment 3
<location path="tests/core/rest/test_failure.py" line_range="262-271" />
<code_context>
+        stream.validate_response(fake_response)
+
+
+def test_request_records_skips_on_ignorable_error(
+    requests_mock: requests_mock_module.Mocker,
+    rest_tap,
+    caplog: pytest.LogCaptureFixture,
+):
+    """request_records yields nothing and logs a warning on IgnorableAPIError.
+
+    No exception should propagate; the stream is silently skipped.
+    """
+    requests_mock.get("https://example.com/dummy", status_code=404, reason="Not Found")
+
+    stream = IgnorableStream(rest_tap)
+
+    with caplog.at_level(logging.WARNING, logger=stream.logger.name):
+        records = list(stream.request_records(None))
+
+    assert records == []
+    assert any(
+        "Skipping request due to ignorable error" in record.message
+        for record in caplog.records
</code_context>
<issue_to_address>
**suggestion (testing):** Consider an additional test for ignorable errors occurring after some records have already been emitted (pagination case)

You’re already covering the case where the first request is ignorable. Since `request_records` is commonly used with paginated streams, please add a test where an initial page returns records and a subsequent page hits an ignorable error (e.g., 404). The test should assert that the records from the successful page are returned, the warning is logged, and pagination stops after the ignorable error.
</issue_to_address>
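
A sketch of what such a pagination test might look like follows. It is deliberately self-contained rather than written against the SDK's fixtures: `PaginatedStream`, `ListHandler`, and the `(status, records)` tuples are hypothetical, and `IgnorableAPIError` is a local stand-in. A real test would more likely use `requests_mock` and `caplog` as the existing test does.

```python
import logging
from collections.abc import Iterator

logger = logging.getLogger("paginated_stream_sketch")


class IgnorableAPIError(Exception):
    """Local stand-in for the SDK exception."""


class ListHandler(logging.Handler):
    """Collect formatted log messages so the test can inspect them."""

    def __init__(self) -> None:
        super().__init__()
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


class PaginatedStream:
    """Hypothetical stream that serves canned (status, records) pages."""

    def __init__(self, responses: list[tuple[int, list[dict]]]) -> None:
        self.responses = responses
        self.requests_made = 0

    def _send(self, index: int) -> list[dict]:
        self.requests_made += 1
        status, records = self.responses[index]
        if status == 404:
            raise IgnorableAPIError("404 Not Found")
        return records

    def request_records(self) -> Iterator[dict]:
        for index in range(len(self.responses)):
            try:
                records = self._send(index)
            except IgnorableAPIError as e:
                logger.warning("Skipping request due to ignorable error: %s", e)
                break
            yield from records


def test_request_records_stops_after_ignorable_error_mid_pagination() -> None:
    handler = ListHandler()
    logger.addHandler(handler)
    try:
        stream = PaginatedStream([(200, [{"id": 1}]), (404, []), (200, [{"id": 2}])])
        records = list(stream.request_records())
    finally:
        logger.removeHandler(handler)

    # Records from the successful first page are kept.
    assert records == [{"id": 1}]
    # The third page is never requested.
    assert stream.requests_made == 2
    assert any("Skipping request due to ignorable error" in m for m in handler.messages)
```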


Comment thread singer_sdk/streams/rest.py Outdated
Comment thread singer_sdk/streams/rest.py
Comment thread tests/core/rest/test_failure.py
Signed-off-by: Edgar Ramírez Mondragón <edgarrm358@gmail.com>
@codspeed-hq

codspeed-hq Bot commented Feb 26, 2026

Merging this PR will not alter performance

✅ 8 untouched benchmarks


Comparing feat/safely-ignore-errors (f419d3b) with main (bad876d)


Signed-off-by: Edgar Ramírez Mondragón <edgarrm358@gmail.com>
@edgarrmondragon edgarrmondragon added the Type/Tap (Singer taps) and HTTP (HTTP based taps and targets such as REST, XML, etc.) labels Feb 26, 2026
@edgarrmondragon edgarrmondragon added this to the v0.54 milestone Feb 26, 2026
@codecov

codecov Bot commented Feb 26, 2026

Codecov Report

❌ Patch coverage is 91.54930% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.86%. Comparing base (bad876d) to head (f419d3b).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| singer_sdk/streams/_result.py | 83.33% | 5 Missing ⚠️ |
| singer_sdk/tap_base.py | 91.66% | 1 Missing ⚠️ |

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #3517   +/-   ##
=======================================
  Coverage   93.85%   93.86%           
=======================================
  Files          73       74    +1     
  Lines        5907     5965   +58     
  Branches      725      735   +10     
=======================================
+ Hits         5544     5599   +55     
- Misses        270      274    +4     
+ Partials       93       92    -1     

| Flag | Coverage Δ | |
|---|---|---|
| core | 82.58% <91.54%> (+0.30%) | ⬆️ |
| end-to-end | 75.28% <61.97%> (-0.15%) | ⬇️ |
| optional-components | 42.63% <29.57%> (-0.12%) | ⬇️ |


Base automatically changed from exception-hierarchy to main February 27, 2026 17:36
@pre-commit-ci pre-commit-ci Bot requested review from a team as code owners February 27, 2026 17:36
@ReubenFrankel
Contributor

This will be nice to replace our custom ResumableAPIError implementation that we have in a couple of taps with! 😁 (thanks for following up even after I forgot to open an issue)

@edgarrmondragon
Collaborator Author

> This will be nice to replace our custom ResumableAPIError implementation that we have in a couple of taps with! 😁 (thanks for following up even after I forgot to open an issue)

Do you have handy examples of those implementations? I'd like to make sure this would actually let us remove the customizations.

@ReubenFrankel
Contributor

Also, IMO it would be good to be able to run a tap in a generic "skip errors" mode and have it report the errors at the end of the sync. That would ensure it gets all the data for all streams where possible, rather than exiting on the first encountered error. It's pretty common that, when a client's pipeline fails and we narrow it down to an issue in a particular stream, they ask us whether data for the other streams was synced - the answer is usually an "unfortunately not" (unless we already added explicit handling of known-resumable errors).

@edgarrmondragon
Collaborator Author

edgarrmondragon commented Mar 4, 2026

> run a tap in a generic "skip errors" mode

I think it might make sense for this mode to actually be the only mode.

I'm trying to imagine what problems that would create, but as long as we capture the errors with contextual information (stream, "stage", etc.) and exit with the right code, it feels like it's the correct approach.
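
A minimal sketch of that idea, assuming each stream can be represented as a callable: every stream is attempted, failures are recorded with their context, and the exit code is derived at the end. `StreamFailure`, `sync_all`, and `exit_code` are hypothetical names for this illustration and not the SDK's API.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class StreamFailure:
    """Context captured for one failed stream (hypothetical structure)."""

    stream: str
    stage: str
    error: Exception


def sync_all(streams: dict[str, Callable[[], None]]) -> list[StreamFailure]:
    """Attempt every stream and return the failures instead of raising."""
    failures: list[StreamFailure] = []
    for name, sync in streams.items():
        try:
            sync()
        except Exception as exc:  # broad on purpose: one stream must not stop the rest
            failures.append(StreamFailure(stream=name, stage="sync", error=exc))
    return failures


def exit_code(failures: list[StreamFailure]) -> int:
    """Exit non-zero if any stream failed, so orchestrators still see the error."""
    return 1 if failures else 0
```

The process still exits non-zero when anything failed, so a pipeline run is not reported as healthy, but data from the streams that did succeed has already been emitted.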

UPDATE:

…ror (#3545)

Stub

## Summary by Sourcery

Track per-stream sync results and allow tap runs to continue syncing
remaining streams after individual failures while reporting an aggregate
outcome.

New Features:
- Introduce a SyncResult enum to represent and combine per-stream sync
outcomes and map them to process exit codes.
- Return an aggregate SyncResult from Tap.sync_all and use it to set the
tap process exit code instead of always exiting successfully.

Bug Fixes:
- Ensure errors in child streams mark the parent stream as partially
successful instead of aborting remaining children and records.
- Convert unexpected exceptions during stream sync into lifecycle abort
exceptions so they are consistently handled and logged.

Enhancements:
- Log a one-line sync outcome per stream and improve error log messages
during sync to include exception details.
- Extend parent/child stream typing with Context and Record aliases and
apply override annotations for better type safety.
- Adjust typing-extensions dependency marker for Python 3.13
compatibility.

Tests:
- Add unit tests and snapshot tests covering SyncResult behavior and tap
behavior when some streams fail while others continue syncing, including
incremental and parent/child scenarios.
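
To make the "represent and combine" idea concrete, here is one possible shape for such an enum. The member names, their ordering, the worst-outcome-wins rule, and the exit-code mapping are all assumptions for this sketch; the page does not show the actual `SyncResult` definition.

```python
from __future__ import annotations

from enum import IntEnum
from functools import reduce
from typing import Iterable


class SyncResult(IntEnum):
    """Hypothetical per-stream outcome, ordered from best to worst."""

    SUCCESS = 0
    PARTIAL = 1
    FAILED = 2

    def combine(self, other: SyncResult) -> SyncResult:
        """Combine two outcomes; in this sketch the worse one wins."""
        return max(self, other)

    @property
    def exit_code(self) -> int:
        """Map the outcome to a process exit code (assumed mapping)."""
        return 0 if self is SyncResult.SUCCESS else 1


def aggregate(results: Iterable[SyncResult]) -> SyncResult:
    """Fold per-stream results into one overall result."""
    return reduce(SyncResult.combine, results, SyncResult.SUCCESS)
```

A tap could then pass `aggregate(per_stream_results).exit_code` to `sys.exit`, which matches the summary's point that the process no longer always exits successfully.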

---------

Signed-off-by: Edgar Ramírez Mondragón <edgarrm358@gmail.com>
Signed-off-by: Edgar Ramírez-Mondragón <edgarrm358@gmail.com>
@edgarrmondragon edgarrmondragon modified the milestones: v0.54, v0.55 May 12, 2026

Labels

HTTP: HTTP based taps and targets such as REST, XML, etc.
Type/Tap: Singer taps

Development

Successfully merging this pull request may close these issues.

feat: IgnorableAPIError to compliment RetriableAPIError and FatalAPIError

2 participants