feat(targets): Use the presence of _sdc_deleted_at to flag records for deletion in hard-delete mode#3450
edgarrmondragon wants to merge 1 commit into
Reviewer's Guide
Implements hard-delete handling for LOG_BASED replication by treating records with …
Sequence diagram for hard-delete handling in SQL sink batch processing
sequenceDiagram
actor TapProcess
participant SQLSink
participant SQLConnector
participant Database
TapProcess->>SQLSink: process_batch(context)
SQLSink->>SQLSink: records = context["records"]
SQLSink->>SQLSink: _split_records_for_hard_delete(records)
SQLSink-->>SQLSink: records_to_delete, records_to_insert
alt hard_delete enabled and key_properties set and records_to_delete not empty
SQLSink->>SQLSink: hard_delete_records(records_to_delete)
SQLSink->>SQLConnector: delete_by_key(full_table_name, key_columns, key_values)
SQLConnector->>SQLConnector: build DELETE statement and bind values
SQLConnector->>Database: execute DELETE ... WHERE (key_conditions)
Database-->>SQLConnector: rowcount
SQLConnector-->>SQLSink: number_deleted
SQLSink->>SQLSink: log deletion count
end
alt records_to_insert not empty
SQLSink->>SQLSink: bulk_insert_records(full_table_name, schema, records_to_insert)
SQLSink->>Database: INSERT rows
Database-->>SQLSink: insert result
end
TapProcess-->>TapProcess: batch complete
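The batch flow in the sequence diagram can be sketched in plain Python. This is a minimal illustration, not the SDK's implementation: the standalone functions `split_records_for_hard_delete` and `process_batch`, and the `delete` and `insert` callables, are hypothetical stand-ins for the sink and connector methods named in the diagram. It also assumes that a record counts as deleted when `_sdc_deleted_at` is present and non-null, and that every record is inserted as usual when hard delete is off or no key properties are set.

```python
from typing import Any, Callable, Iterable, Sequence

Record = dict[str, Any]
DELETED_AT = "_sdc_deleted_at"


def split_records_for_hard_delete(
    records: Iterable[Record],
) -> tuple[list[Record], list[Record]]:
    """Return (records_to_delete, records_to_insert)."""
    to_delete: list[Record] = []
    to_insert: list[Record] = []
    for record in records:
        # Assumption: a non-null _sdc_deleted_at flags the row for deletion.
        if record.get(DELETED_AT) is not None:
            to_delete.append(record)
        else:
            to_insert.append(record)
    return to_delete, to_insert


def process_batch(
    records: Iterable[Record],
    *,
    hard_delete: bool,
    key_properties: Sequence[str],
    delete: Callable[[Sequence[str], list[Record]], int],
    insert: Callable[[list[Record]], None],
) -> None:
    """Delete flagged records by key, then insert the remaining ones."""
    if hard_delete and key_properties:
        to_delete, to_insert = split_records_for_hard_delete(records)
    else:
        # Assumption: without hard delete or keys, nothing is split off.
        to_delete, to_insert = [], list(records)

    if to_delete:
        # Only the key columns are needed to identify rows to remove.
        key_values = [{k: r[k] for k in key_properties} for r in to_delete]
        delete(key_properties, key_values)

    if to_insert:
        insert(to_insert)
```

Passing the delete and insert steps as callables keeps the sketch independent of any database driver; in the PR these steps correspond to `delete_by_key` on the connector and `bulk_insert_records` on the sink.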
Updated class diagram for SQL sink hard-delete support
classDiagram
class Sink {
config: dict
_remove_sdc_metadata_from_record(record: dict) void
}
class SQLSink {
config: dict
key_properties: list~str~
full_table_name: str
schema: dict
soft_delete_column_name: str
logger
connector: SQLConnector
process_batch(context: dict) void
bulk_insert_records(full_table_name: str, schema: dict, records: Iterable~dict~) void
_split_records_for_hard_delete(records: Iterable~dict~) tuple~list~dict~~ list~dict~~
hard_delete_records(records: Sequence~dict~) int
conform_name(name: str, object_type: str) str
conform_record(record: dict) dict
}
class SQLConnector {
delete_by_key(full_table_name: str, key_columns: Sequence~str~, key_values: Sequence~dict~) int
_connect() Connection
}
Sink <|-- SQLSink
SQLSink *-- SQLConnector
%% Metadata handling behavior
class RecordMetadataBehavior {
+handles_sdc_deleted_at_based_on_hard_delete_flag
}
Sink ..> RecordMetadataBehavior
SQLSink ..> RecordMetadataBehavior
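To make the `delete_by_key` signature from the class diagram concrete, here is a rough sketch against the standard-library `sqlite3` module. It is an assumption-laden illustration rather than the connector code in this PR: it takes an explicit `sqlite3.Connection` instead of using `_connect()`, the `_quote` helper is hypothetical, and it issues one `DELETE` per key instead of a single statement with combined key conditions.

```python
import sqlite3
from typing import Any, Sequence


def _quote(identifier: str) -> str:
    """Quote an SQL identifier (hypothetical helper, SQLite-style quoting)."""
    return '"' + identifier.replace('"', '""') + '"'


def delete_by_key(
    conn: sqlite3.Connection,
    full_table_name: str,
    key_columns: Sequence[str],
    key_values: Sequence[dict[str, Any]],
) -> int:
    """Delete rows matching any of the given keys; return the number deleted."""
    if not key_columns or not key_values:
        return 0

    # Identifiers cannot be bound as parameters, so they are quoted;
    # the key values themselves are always passed as bound parameters.
    condition = " AND ".join(f"{_quote(col)} = ?" for col in key_columns)
    sql = f"DELETE FROM {_quote(full_table_name)} WHERE {condition}"

    deleted = 0
    for row in key_values:
        params = tuple(row[col] for col in key_columns)
        # cursor.rowcount reports how many rows this DELETE removed.
        deleted += conn.execute(sql, params).rowcount
    return deleted
```

Keys that match no row simply contribute zero to the returned count, so the caller can log the total without special-casing rows that were already absent.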
Documentation build overview
Show files changed (3 files in total): 📝 3 modified | ➕ 0 added | ➖ 0 deleted
CodSpeed Performance Report
Merging this PR will not alter performance
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3450      +/-   ##
==========================================
+ Coverage   94.14%   94.17%   +0.03%
==========================================
  Files          70       70
  Lines        5785     5820      +35
  Branches      716      724       +8
==========================================
+ Hits         5446     5481      +35
  Misses        236      236
  Partials      103      103
86d110c to f06af41
…for deletion in hard-delete mode Signed-off-by: Edgar Ramírez Mondragón <edgarrm358@gmail.com>
f06af41 to 051b3d5
Related
hard_delete capability does not consider the presence of non-null _sdc_deleted_at values generated during LOG_BASED replication #3444
Summary by Sourcery
Implement hard-delete handling for log-based replication using the _sdc_deleted_at flag and expose a key-based delete API for SQL connectors.
New Features:
- … _sdc_deleted_at during LOG_BASED replication when hard_delete is enabled.
- … delete_by_key method to delete rows by primary key across SQL targets.
Enhancements:
- … _sdc_deleted_at metadata on records when hard_delete is enabled so it can be used to drive hard deletes.
Tests: