log backup: update metafile format #21412

Open: wants to merge 4 commits into base: master
16 changes: 11 additions & 5 deletions br/br-log-architecture.md
@@ -109,7 +109,7 @@ Log backup generates the following types of files:
├── v1
│   ├── backupmeta
│   │   ├── ...
-│   │   └── {resolved_ts}-{uuid}.meta
+│   │   └── {flushTs}-{minDefaultTs}-{minTs}-{maxTs}.meta
│   ├── global_checkpoint
│   │   └── {store_id}.ts
│   └── {date}
@@ -122,7 +122,13 @@ Log backup generates the following types of files:

Explanation of the backup file directory structure:

-- `backupmeta`: stores backup metadata. The `resolved_ts` in the filename indicates the backup progress, meaning that data before this TSO has been fully backed up. However, note that this TSO only reflects the progress of certain shards.
+- `backupmeta` directory: stores backup metadata files. Starting from v8.5.3 and v9.0.0, the naming convention of these files changes from `{resolved_ts}-{uuid}.meta` to `{flushTs}-{minDefaultTs}-{minTs}-{maxTs}.meta`. The filename contains the following timestamp fields:
+    - `flushTs`: the timestamp when the backup file is periodically uploaded to the external storage. This value is obtained from PD and is globally unique.
+    - `minDefaultTs` (only applicable to Write CF files): the earliest transaction start time covered by this backup.
+    - `minTs` and `maxTs`: the minimum and maximum timestamps of all key-value data included in the backup file.
+
+    All these timestamps are encoded as fixed-length 16-digit hexadecimal strings, left-padded with zeros. Because every field has the same length, the lexicographical order of the filenames matches the numerical order of the timestamps, which makes batch listing and range filtering in external storage systems efficient.
+
- `global_checkpoint`: represents the global backup progress. It records the latest point in time to which data can be restored using `br restore point`.
- `{date}/{hour}`: stores backup data for the corresponding date and hour. When cleaning up storage, always use `br log truncate` instead of manually deleting data. This is because the metadata references the data in this directory, and manual deletion might lead to restore failures or data inconsistencies after restore.
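
The new filename layout described above can be illustrated with a minimal Python sketch. It is not BR's implementation: the names `MetaName`, `format_meta_name`, and `parse_meta_name` are hypothetical, and the sketch assumes that all four fields are always present and written in lowercase hexadecimal, as in the example filenames in this change.

```python
from typing import NamedTuple

HEX_WIDTH = 16  # each timestamp is a fixed-length 16-digit hexadecimal string
SUFFIX = ".meta"


class MetaName(NamedTuple):
    """Hypothetical container for the four timestamps in a metafile name."""
    flush_ts: int
    min_default_ts: int
    min_ts: int
    max_ts: int


def format_meta_name(name: MetaName) -> str:
    """Encode the timestamps as zero-padded hex fields joined by '-'."""
    return "-".join(f"{ts:0{HEX_WIDTH}x}" for ts in name) + SUFFIX


def parse_meta_name(filename: str) -> MetaName:
    """Decode a new-style metafile name back into integer timestamps."""
    if not filename.endswith(SUFFIX):
        raise ValueError(f"not a metafile name: {filename!r}")
    fields = filename[: -len(SUFFIX)].split("-")
    if len(fields) != 4 or any(len(f) != HEX_WIDTH for f in fields):
        raise ValueError(f"unexpected metafile name layout: {filename!r}")
    return MetaName(*(int(f, 16) for f in fields))


# Example filenames taken from this change, deliberately out of order.
EXAMPLE_NAMES = [
    "063c2ac1c0cdd5c3-0609d2e6b3bcb064-060ab960016d2f84-060c0b9e47d47a77.meta",
    "060c4bc7b0cdd582-06097a780d1ba138-060ab960016d2f00-060c0b9e47d4787b.meta",
    "06123bc6a0cdd591-060c3d24585be000-060c4453954a4000-060c4bc7b0cdcfa4.meta",
]

# A plain string sort orders the files by flushTs, because every field
# has the same width and the same zero padding.
by_string = sorted(EXAMPLE_NAMES)
by_flush_ts = sorted(EXAMPLE_NAMES, key=lambda n: parse_meta_name(n).flush_ts)
```

In this sketch, `by_string` and `by_flush_ts` hold the same ordering, which is the property that lets an external storage listing be filtered by filename prefix or range without opening each file.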

@@ -133,9 +139,9 @@ The following is an example:
├── v1
│   ├── backupmeta
│   │   ├── ...
-│   │   ├── 435213818858112001-e2569bda-a75a-4411-88de-f469b49d6256.meta
-│   │   ├── 435214043785779202-1780f291-3b8a-455e-a31d-8a1302c43ead.meta
-│   │   └── 435214443785779202-224f1408-fff5-445f-8e41-ca4fcfbd2a67.meta
+│   │   ├── 060c4bc7b0cdd582-06097a780d1ba138-060ab960016d2f00-060c0b9e47d4787b.meta
+│   │   ├── 06123bc6a0cdd591-060c3d24585be000-060c4453954a4000-060c4bc7b0cdcfa4.meta
+│   │   └── 063c2ac1c0cdd5c3-0609d2e6b3bcb064-060ab960016d2f84-060c0b9e47d47a77.meta
│   ├── global_checkpoint
│   │   ├── 1.ts
│   │   ├── 2.ts