diff --git a/ticdc/ticdc-csv.md b/ticdc/ticdc-csv.md
index fcb578cf41ee..960005bc6b04 100644
--- a/ticdc/ticdc-csv.md
+++ b/ticdc/ticdc-csv.md
@@ -29,6 +29,7 @@ null = '\N'
 include-commit-ts = true
 binary-encoding-method = 'base64'
 output-old-value = false
+output-field-header = false
 ```
 
 ## Transactional constraints on data storage
@@ -52,6 +53,12 @@ In a CSV file, each column of a single row is defined as follows:
 - Column 5: `is-update`. This column exists only when `output-old-value` is true. It indicates whether the row change comes from an Update event (value true) or from an Insert/Delete event (value false).
 - Column 6 to the last column: the columns of changed data, which can be one or more columns.
 
+When `output-field-header = true` is configured, the CSV file contains a header row with the following column names:
+
+| Column 1 | Column 2 | Column 3 | Column 4 (optional) | Column 5 (optional) | Column 6 | ... | Last column |
+| --- | --- | --- | --- | --- | --- | --- | --- |
+| `ticdc-meta$operation` | `ticdc-meta$table` | `ticdc-meta$schema` | `ticdc-meta$commit-ts` | `ticdc-meta$is-update` | Name of the first column of changed data | ... | Name of the last column of changed data |
+
 Assume that the table `hr.employee` is defined as follows:
 
 ```sql
@@ -86,6 +93,19 @@ CREATE TABLE `employee` (
 "I","employee","hr",433305438660591630,true,102,"Alex","Alice","2018-06-15","Beijing"
 ```
 
+When `include-commit-ts = true`, `output-old-value = true`, and `output-field-header = true` are all configured, the DML events of this table are stored in CSV format as follows:
+
+```
+ticdc-meta$operation,ticdc-meta$table,ticdc-meta$schema,ticdc-meta$commit-ts,ticdc-meta$is-update,Id,LastName,FirstName,HireDate,OfficeLocation
+"I","employee","hr",433305438660591626,false,101,"Smith","Bob","2014-06-04","New York"
+"D","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Shanghai"
+"I","employee","hr",433305438660591627,true,101,"Smith","Bob","2015-10-08","Los Angeles"
+"D","employee","hr",433305438660591629,false,101,"Smith","Bob","2017-03-13","Dallas"
+"I","employee","hr",433305438660591630,false,102,"Alex","Alice","2017-03-14","Shanghai"
+"D","employee","hr",433305438660591630,true,102,"Alex","Alice","2017-03-14","Beijing"
+"I","employee","hr",433305438660591630,true,102,"Alex","Alice","2018-06-15","Beijing"
+```
+
 ## Data type mapping
 
 | MySQL type | CSV type | Example | Description |