From 6bb8123a0969e983736c4635e88ef18c99340150 Mon Sep 17 00:00:00 2001
From: nhsmw
Date: Tue, 10 Jun 2025 17:07:26 +0800
Subject: [PATCH 1/4] Update ticdc-csv.md

---
 ticdc/ticdc-csv.md | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/ticdc/ticdc-csv.md b/ticdc/ticdc-csv.md
index fcb578cf41ee..0e9b92951594 100644
--- a/ticdc/ticdc-csv.md
+++ b/ticdc/ticdc-csv.md
@@ -29,6 +29,7 @@ null = '\N'
 include-commit-ts = true
 binary-encoding-method = 'base64'
 output-old-value = false
+output-field-header = false
 ```
 
 ## Transactional constraints
@@ -52,6 +53,10 @@ In the CSV file, each column is defined as follows:
 - Column 5: `is-update`. This column exists only when `output-old-value` is true. It identifies whether the row change comes from an Update event (value is true) or from an Insert/Delete event (value is false).
 - Column 6 to the last column: the columns with data changes, which can be one or more columns.
 
+| column1 | column2 | column3 | column4(optional) | column5(optional) | column6 | ... |columnX |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| ticdc-meta$operation | ticdc-meta$table | ticdc-meta$schema | ticdc-meta$commit-ts | ticdc-meta$is-update | col1 | xxx | colX |
+
 Assume that table `hr.employee` is defined as follows:
 
 ```sql
@@ -86,6 +91,19 @@ CREATE TABLE `employee` (
 "I","employee","hr",433305438660591630,true,102,"Alex","Alice","2018-06-15","Beijing"
 ```
 
+When `include-commit-ts = true`, `output-old-value = true`, and `output-field-header = true` are all configured, the DML events on this table are stored in CSV format as follows:
+
+```
+ticdc-meta$operation,ticdc-meta$table,ticdc-meta$schema,ticdc-meta$commit-ts,ticdc-meta$is-update,Id,LastName,FirstName,HireDate,OfficeLocation
+"I","employee","hr",433305438660591626,false,101,"Smith","Bob","2014-06-04","New York"
+"D","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Shanghai"
+"I","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Los Angeles"
+"D","employee","hr",433305438660591629,false,101,"Smith","Bob","2017-03-13","Dallas"
+"I","employee","hr",433305438660591630,false,102,"Alex","Alice","2017-03-14","Shanghai"
+"D","employee","hr",433305438660591630,true,102,"Alex","Alice","2017-03-14","Beijing"
+"I","employee","hr",433305438660591630,true,102,"Alex","Alice","2018-06-15","Beijing"
+```
+
 ## Data type mapping
 
 | MySQL type | CSV type | Example | Description |

From 6a25339c1c72ac7ff8cb147b136cb2acaa18ad3c Mon Sep 17 00:00:00 2001
From: nhsmw
Date: Tue, 15 Jul 2025 17:13:54 +0800
Subject: [PATCH 2/4] Apply suggestions from code review

Co-authored-by: Grace Cai
---
 ticdc/ticdc-csv.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/ticdc/ticdc-csv.md b/ticdc/ticdc-csv.md
index 0e9b92951594..7b1b4089d2eb 100644
--- a/ticdc/ticdc-csv.md
+++ b/ticdc/ticdc-csv.md
@@ -52,10 +52,10 @@ In the CSV file, each column is defined as follows:
 - Column 4: `commit ts`, the commit ts of the original transaction. This column is optional.
 - Column 5: `is-update`. This column exists only when `output-old-value` is true. It identifies whether the row change comes from an Update event (value is true) or from an Insert/Delete event (value is false).
 - Column 6 to the last column: the columns with data changes, which can be one or more columns.
-
-| column1 | column2 | column3 | column4(optional) | column5(optional) | column6 | ... |columnX |
+When `output-field-header = true` is configured, the CSV file contains a header row. The column names in the header row are as follows:
+| Column 1 | Column 2 | Column 3 | Column 4 (optional) | Column 5 (optional) | Column 6 | ... | Last column |
 | --- | --- | --- | --- | --- | --- | --- | --- |
-| ticdc-meta$operation | ticdc-meta$table | ticdc-meta$schema | ticdc-meta$commit-ts | ticdc-meta$is-update | col1 | xxx | colX |
+| `ticdc-meta$operation` | `ticdc-meta$table` | `ticdc-meta$schema` | `ticdc-meta$commit-ts` | `ticdc-meta$is-update` | Name of the first column with data changes | ... | Name of the last column with data changes |
 
 Assume that table `hr.employee` is defined as follows:
 

From 05e3fd224897cf7480235be62a7e9ee1a9263d8c Mon Sep 17 00:00:00 2001
From: Grace Cai
Date: Tue, 15 Jul 2025 17:14:42 +0800
Subject: [PATCH 3/4] add blank lines

---
 ticdc/ticdc-csv.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/ticdc/ticdc-csv.md b/ticdc/ticdc-csv.md
index 7b1b4089d2eb..960005bc6b04 100644
--- a/ticdc/ticdc-csv.md
+++ b/ticdc/ticdc-csv.md
@@ -52,7 +52,9 @@ In the CSV file, each column is defined as follows:
 - Column 4: `commit ts`, the commit ts of the original transaction. This column is optional.
 - Column 5: `is-update`. This column exists only when `output-old-value` is true. It identifies whether the row change comes from an Update event (value is true) or from an Insert/Delete event (value is false).
 - Column 6 to the last column: the columns with data changes, which can be one or more columns.
+
 When `output-field-header = true` is configured, the CSV file contains a header row. The column names in the header row are as follows:
+
 | Column 1 | Column 2 | Column 3 | Column 4 (optional) | Column 5 (optional) | Column 6 | ... | Last column |
 | --- | --- | --- | --- | --- | --- | --- | --- |
 | `ticdc-meta$operation` | `ticdc-meta$table` | `ticdc-meta$schema` | `ticdc-meta$commit-ts` | `ticdc-meta$is-update` | Name of the first column with data changes | ... | Name of the last column with data changes |

From 76f059592edd591d504642b558832ce7b3d95d7d Mon Sep 17 00:00:00 2001
From: Grace Cai
Date: Wed, 23 Jul 2025 14:12:37 +0800
Subject: [PATCH 4/4] Update ticdc/ticdc-csv.md

---
 ticdc/ticdc-csv.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ticdc/ticdc-csv.md b/ticdc/ticdc-csv.md
index 960005bc6b04..89e17f82de29 100644
--- a/ticdc/ticdc-csv.md
+++ b/ticdc/ticdc-csv.md
@@ -29,7 +29,7 @@ null = '\N'
 include-commit-ts = true
 binary-encoding-method = 'base64'
 output-old-value = false
-output-field-header = false
+output-field-header = false # Introduced in v9.0.0
 ```
 
 ## Transactional constraints
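
Reviewer note: a minimal sketch of how a downstream consumer might read a file written with `output-field-header = true`, splitting the `ticdc-meta$` columns from the table's own columns by header name. The sample rows are copied from the example added in PATCH 1/4. The helper name `split_rows` and the constant `META_PREFIX` are hypothetical and not part of TiCDC. The sketch assumes the default delimiter and quote character and does not handle the `null = '\N'` marker.

```python
import csv
import io

# Prefix that the header row uses for TiCDC metadata columns (from the patch).
META_PREFIX = "ticdc-meta$"

# Header row plus two data rows, copied from the example in PATCH 1/4.
SAMPLE = '''ticdc-meta$operation,ticdc-meta$table,ticdc-meta$schema,ticdc-meta$commit-ts,ticdc-meta$is-update,Id,LastName,FirstName,HireDate,OfficeLocation
"I","employee","hr",433305438660591626,false,101,"Smith","Bob","2014-06-04","New York"
"D","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Shanghai"
'''


def split_rows(text):
    """Yield (meta, data) dict pairs, one per CSV row below the header.

    Hypothetical helper: metadata keys have the prefix stripped, and all
    values stay as strings exactly as they appear in the file.
    """
    reader = csv.DictReader(io.StringIO(text))
    for row in reader:
        meta = {
            key[len(META_PREFIX):]: value
            for key, value in row.items()
            if key.startswith(META_PREFIX)
        }
        data = {
            key: value
            for key, value in row.items()
            if not key.startswith(META_PREFIX)
        }
        yield meta, data


rows = list(split_rows(SAMPLE))
for meta, data in rows:
    print(meta["operation"], meta["is-update"], data)
```

Keying on the header names rather than on column positions means the same reader keeps working when the optional `commit-ts` or `is-update` columns are absent.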