Commit 44441d7

update
1 parent 4ee5e0c commit 44441d7

16 files changed

Lines changed: 483 additions & 435 deletions

docs/Go-SDK/go-sdk-advanced-state-api-zh.md

Lines changed: 3 additions & 4 deletions
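@@ -54,7 +54,7 @@

 **Keyed vs non-keyed**

-- **Keyed state** is for **keyed operators**: streams partitioned by a key (e.g. after keyBy). The runtime delivers records per key; each key should have its own isolated state. Obtain a **factory** **once** (from context, store name, and keyGroup), then create state per **primary key** (the stream key), e.g. via `factory.NewKeyedValue(primaryKey, stateName)`.
+- **Keyed state** is for **keyed operators**: streams partitioned by a key (e.g. after keyBy). The runtime delivers records per key; each key should have its own isolated state. Obtain a **factory** **once** (from context, store name, and keyGroup), then construct the corresponding state type from the **primary key** (the stream key) and a namespace.
 - **Non-keyed state** (ValueState, ListState, etc.) stores one logical entity per store. Use it when there is no key partitioning or when maintaining a single global state.

 ---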
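@@ -161,7 +161,7 @@

 ## 7. Keyed State: Factories and Per-Key Instances

-Keyed state is for **keyed operators**: when the stream is partitioned by key (e.g. keyBy), each key is processed against its own isolated state. Obtain a **factory** **once** (from context, store name, and **keyGroup**), then create state per **primary key** (the stream key of the current record), e.g. `factory.NewKeyedValue(primaryKey, stateName)`.
+Keyed state is for **keyed operators**: when the stream is partitioned by key (e.g. keyBy), each key is processed against its own isolated state. Obtain a **factory** **once** (from context, store name, and **keyGroup**), then construct the corresponding state type from the **primary key** (the stream key of the current record) and a namespace.

 State is organized by **keyGroup** ([]byte) and **primary key** (primaryKey, []byte). Create the factory from the context, store name, and keyGroup; then obtain state per primary key through the factory methods.
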
@@ -172,7 +172,7 @@ The Keyed API maps onto the store's **ComplexKey** with three dimensions:
 | Term | Where it appears | Meaning |
 |---------------|------------------|---------|
 | **keyGroup** | Argument when creating the factory | The **keyed group**: identifies which partition/group this state belongs to (e.g. `[]byte("counters")`, `[]byte("sessions")`). Same keyed group ⇒ same keyGroup bytes. |
-| **key** | `primaryKey` in factory methods (e.g. `NewKeyedValue(primaryKey, ...)`, `NewKeyedList(primaryKey, namespace)`) | The **value of the stream key**: the key that partitioned the stream, serialized as bytes (e.g. user ID, partition key). Each distinct primaryKey gets its own state. |
+| **key** | `primaryKey` in factory methods (e.g. `NewKeyedList(primaryKey, namespace)`) | The **value of the stream key**: the key that partitioned the stream, serialized as bytes (e.g. user ID, partition key). Each distinct primaryKey gets its own state. |
 | **namespace** | `namespace` ([]byte) in factory methods | **With windows**: the **window identifier as bytes** (e.g. serialized window bounds or window ID), so state is isolated per key and per window. **Without windows**: pass **empty bytes** (`nil` or `[]byte{}`). |

 **Summary:** **keyGroup** = keyed group identifier; **key** (primaryKey) = stream key value; **namespace** = window bytes when using windows, otherwise empty.
@@ -192,7 +192,6 @@ The Keyed API maps onto the store's **ComplexKey** with three dimensions:

 | Factory | Method | Returns |
 |-----------------------------------|--------|---------|
-| KeyedValueStateFactory[V] | `NewKeyedValue(primaryKey []byte, stateName string) (*KeyedValueState[V], error)` | One value state per (primaryKey, stateName). |
 | KeyedListStateFactory[V] | `NewKeyedList(primaryKey []byte, namespace []byte) (*KeyedListState[V], error)` | One list state per (primaryKey, namespace). |
 | KeyedMapStateFactory[MK,MV] | `NewKeyedMap(primaryKey []byte, mapName string) (*KeyedMapState[MK,MV], error)` | One map state per (primaryKey, mapName). |
 | KeyedPriorityQueueStateFactory[V] | `NewKeyedPriorityQueue(primaryKey []byte, namespace []byte) (*KeyedPriorityQueueState[V], error)` | One PQ state per (primaryKey, namespace). |

docs/Go-SDK/go-sdk-advanced-state-api.md

Lines changed: 2 additions & 3 deletions
@@ -54,7 +54,7 @@ The advanced state API offers typed views over a single logical store. Pick the

 **Keyed vs non-keyed**

-- **Keyed state** is for **keyed operators**: streams partitioned by a key (e.g. after keyBy). The runtime delivers records per key; each key should have isolated state. Obtain a **factory** once (from context, store name, and keyGroup), then create state **per primary key** (the stream key) via e.g. `factory.NewKeyedValue(primaryKey, stateName)`.
+- **Keyed state** is for **keyed operators**: streams partitioned by a key (e.g. after keyBy). The runtime delivers records per key; each key should have isolated state. Obtain a **factory** once (from context, store name, and keyGroup), then construct the corresponding state type per **primary key** (stream key) and namespace.
 - **Non-keyed state** (ValueState, ListState, etc.) stores one logical entity per store. Use it when there is no key partitioning or you maintain a single global state.

 ---
@@ -172,7 +172,7 @@ The Keyed API maps onto the store’s **ComplexKey** with three dimensions:
 | Term | Where it appears | Meaning |
 |---------------|------------------|---------|
 | **keyGroup** | Argument when creating the factory | The **keyed group**: identifies which keyed partition/group this state belongs to. Use one keyGroup per logical “keyed group” or state kind (e.g. `[]byte("counters")`, `[]byte("sessions")`). Same keyed group ⇒ same keyGroup bytes. |
-| **key** | `primaryKey` in factory methods (e.g. `NewKeyedValue(primaryKey, ...)`, `NewKeyedList(primaryKey, namespace)`) | The **value of the stream key**: the key that partitioned the stream, serialized as bytes (e.g. user ID, partition key). Each distinct primaryKey gets isolated state. |
+| **key** | `primaryKey` in factory methods (e.g. `NewKeyedList(primaryKey, namespace)`) | The **value of the stream key**: the key that partitioned the stream, serialized as bytes (e.g. user ID, partition key). Each distinct primaryKey gets isolated state. |
 | **namespace** | `namespace` ([]byte) in factory methods that take it | **With window functions**: use the **window identifier as bytes** (e.g. serialized window bounds or window ID) so state is scoped per key *and* per window. **Without windows**: pass **empty bytes** (`nil` or `[]byte{}`). |

 **Summary:** **keyGroup** = keyed group identifier; **key** (primaryKey) = stream key value; **namespace** = window bytes when using windows, otherwise empty.
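
To make the three dimensions concrete, here is a minimal Python sketch of how a composite key could be assembled from a keyGroup, a primaryKey, and a namespace. The `complex_key` helper, its length-prefixed layout, and the `window_namespace` encoding are illustrative assumptions; the diff does not show how the store actually encodes its ComplexKey.

```python
import struct


def _frame(part: bytes) -> bytes:
    """Length-prefix one component so component boundaries stay unambiguous."""
    return struct.pack(">I", len(part)) + part


def complex_key(key_group: bytes, primary_key: bytes, namespace: bytes = b"") -> bytes:
    """Hypothetical composite key: keyGroup, then primaryKey, then namespace."""
    return _frame(key_group) + _frame(primary_key) + _frame(namespace)


def window_namespace(start_ms: int, end_ms: int) -> bytes:
    """Hypothetical window identifier: big-endian start and end timestamps."""
    return struct.pack(">qq", start_ms, end_ms)


# In-memory stand-in for the store, indexed by the composite key.
store = {}

user = b"user-42"
# No window: the namespace is empty bytes.
store[complex_key(b"counters", user)] = 1
# Windowed: the namespace carries the window identifier.
store[complex_key(b"counters", user, window_namespace(0, 60_000))] = 5
# Same primary key, different keyed group.
store[complex_key(b"sessions", user)] = 9

# Three isolated entries exist for the same primary key.
assert len(store) == 3
```

The length prefix is there only so that `(b"a", b"bc")` and `(b"ab", b"c")` cannot collide; any unambiguous framing would serve the same purpose.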
@@ -192,7 +192,6 @@ The Keyed API maps onto the store’s **ComplexKey** with three dimensions:

 | Factory | Method | Returns |
 |-----------------------------------|--------|---------|
-| KeyedValueStateFactory[V] | `NewKeyedValue(primaryKey []byte, stateName string) (*KeyedValueState[V], error)` | One value state per (primaryKey, stateName). |
 | KeyedListStateFactory[V] | `NewKeyedList(primaryKey []byte, namespace []byte) (*KeyedListState[V], error)` | List state per (primaryKey, namespace). |
 | KeyedMapStateFactory[MK,MV] | `NewKeyedMap(primaryKey []byte, mapName string) (*KeyedMapState[MK,MV], error)` | Map state per (primaryKey, mapName). |
 | KeyedPriorityQueueStateFactory[V] | `NewKeyedPriorityQueue(primaryKey []byte, namespace []byte) (*KeyedPriorityQueueState[V], error)` | PQ state per (primaryKey, namespace). |

docs/Python-SDK/python-sdk-advanced-state-api-zh.md

Lines changed: 33 additions & 13 deletions
@@ -79,25 +79,45 @@
 | Concept | API parameter | Meaning |
 |---------------|---------------|---------|
 | **key_group** | `key_group` when creating the factory | The **keyed group**: identifies which partition/group this state belongs to (e.g. one group for "counters", another for "sessions"). |
-| **key** | Factory method argument (e.g. `new_keyed_value(primary_key)`) | The **stream key value** of the current record (e.g. user ID, partition key). Each distinct key gets its own state. |
+| **key** | `primary_key` when constructing the state (e.g. `KeyedValueState(factory, primary_key, namespace)`) | The **stream key value** of the current record (e.g. user ID, partition key). Each distinct key gets its own state. |
 | **namespace** | `namespace` (bytes) when creating the factory | **With windows**: the **window identifier as bytes**. **Without windows**: **empty bytes** (e.g. `b""`). |

 ### 4.2 Keyed Factory Constructors at a Glance

 | Factory | With codec | AutoCodec |
 |--------------------------------|------------|-----------|
-| KeyedValueStateFactory | `KeyedValueStateFactory.from_context(ctx, store_name, namespace, key_group, value_codec)` | `KeyedValueStateFactory.from_context_auto_codec(ctx, store_name, namespace, key_group, value_type=None)` |
-| KeyedListStateFactory | `KeyedListStateFactory.from_context(ctx, store_name, namespace, key_group, value_codec)` | `KeyedListStateFactory.from_context_auto_codec(ctx, store_name, namespace, key_group, value_type=None)` |
-| KeyedMapStateFactory | `KeyedMapStateFactory.from_context(ctx, store_name, namespace, key_group, key_codec, value_codec)` | `KeyedMapStateFactory.from_context_auto_codec(ctx, store_name, namespace, key_group, value_codec)` |
-| KeyedPriorityQueueStateFactory | `KeyedPriorityQueueStateFactory.from_context(ctx, store_name, namespace, key_group, item_codec)` | `KeyedPriorityQueueStateFactory.from_context_auto_codec(ctx, store_name, namespace, key_group, item_type=None)` |
-| KeyedAggregatingStateFactory | `KeyedAggregatingStateFactory.from_context(ctx, store_name, namespace, key_group, acc_codec, agg_func)` | `KeyedAggregatingStateFactory.from_context_auto_codec(ctx, store_name, namespace, key_group, agg_func, acc_type=None)` |
-| KeyedReducingStateFactory | `KeyedReducingStateFactory.from_context(ctx, store_name, namespace, key_group, value_codec, reduce_func)` | `KeyedReducingStateFactory.from_context_auto_codec(ctx, store_name, namespace, key_group, reduce_func, value_type=None)` |
+| KeyedValueStateFactory | `KeyedValueStateFactory.from_context(ctx, store_name, key_group, value_codec)` | `KeyedValueStateFactory.from_context_auto_codec(ctx, store_name, key_group, value_type=None)` |
+| KeyedListStateFactory | `KeyedListStateFactory.from_context(ctx, store_name, key_group, value_codec)` | `KeyedListStateFactory.from_context_auto_codec(ctx, store_name, key_group, value_type=None)` |
+| KeyedMapStateFactory | `KeyedMapStateFactory.from_context(ctx, store_name, key_group, map_key_codec, map_value_codec)` | `KeyedMapStateFactory.from_context_auto_codec(ctx, store_name, key_group, map_key_type=None, map_value_type=None)` |
+| KeyedPriorityQueueStateFactory | `KeyedPriorityQueueStateFactory.from_context(ctx, store_name, key_group, item_codec)` | `KeyedPriorityQueueStateFactory.from_context_auto_codec(ctx, store_name, key_group, item_type=None)` |
+| KeyedAggregatingStateFactory | `KeyedAggregatingStateFactory.from_context(ctx, store_name, key_group, acc_codec, agg_func)` | `KeyedAggregatingStateFactory.from_context_auto_codec(ctx, store_name, key_group, agg_func, acc_type=None)` |
+| KeyedReducingStateFactory | `KeyedReducingStateFactory.from_context(ctx, store_name, key_group, value_codec, reduce_func)` | `KeyedReducingStateFactory.from_context_auto_codec(ctx, store_name, key_group, reduce_func, value_type=None)` |

 You can also use the Context's `ctx.getOrCreateKeyed*Factory(...)` methods, which delegate internally to the constructors above.

 ### 4.3 KeyedValueState

-KeyedValueState needs only a **value codec**; ordering is not required. Create state from the factory: `factory.new_keyed_value(primary_key, state_name="")`, which yields a `KeyedValueState[V]`. State methods: `update(value)`, `value()` (returns `Optional[V]`), `clear()`. The primary key is fixed by the `primary_key` (bytes) passed at creation.
+KeyedValueState matches the Go SDK: the factory needs only `key_group` (no namespace). Factory: `KeyedValueStateFactory.from_context(ctx, store_name, key_group, value_codec)` or `from_context_auto_codec(ctx, store_name, key_group, value_type=None)`. Construct state: `KeyedValueState(factory, primary_key, namespace)`, where namespace may be `state_name.encode("utf-8")`. State methods: `update(value)`, `value()` (returns `(value, found)`), `clear()`.
+
+### 4.4 KeyedListState
+
+KeyedListState matches the Go SDK: the factory needs only `key_group` (no namespace); the **key** and **namespace** are passed when the list is created. Factory: `KeyedListStateFactory.from_context(ctx, store_name, key_group, value_codec)` or `from_context_auto_codec(ctx, store_name, key_group, value_type=None)`. Create a list: `factory.new_keyed_list(key, namespace)`, which yields a `KeyedListState[V]`. State methods: `add(value)`, `add_all(values)`, `get()` (returns `List[V]`), `update(values)` (clears, then writes the whole list), `clear()`.
+
+### 4.5 KeyedAggregatingState
+
+KeyedAggregatingState matches the Go SDK: the factory needs only `key_group` (no namespace). Factory: `KeyedAggregatingStateFactory.from_context(ctx, store_name, key_group, acc_codec, agg_func)` or `from_context_auto_codec(ctx, store_name, key_group, agg_func, acc_type=None)`. Create state: `factory.new_aggregating_state(primary_key, state_name="")`, which yields a `KeyedAggregatingState[T, ACC, R]` bound to that (primary_key, namespace=state_name). State methods: `add(value)` (merges into the current state's accumulator), `get()` (returns `(result, found)`), `clear()`.
+
+### 4.6 KeyedMapState
+
+KeyedMapState matches the Go SDK: the factory needs only `key_group` (no namespace), and the map key codec must be ordered. Factory: `KeyedMapStateFactory.from_context(ctx, store_name, key_group, map_key_codec, map_value_codec)` or `from_context_auto_codec(ctx, store_name, key_group, map_key_type=None, map_value_type=None)`. Create a map: `factory.new_keyed_map(primary_key, map_name)` (map_name is required and is converted to the namespace), which yields a `KeyedMapState[MK, MV]`. State methods: `put(map_key, value)`, `get(map_key)` (returns `(value, found)`), `delete(map_key)`, `clear()` (deletes all entries of this map by prefix), `all()` (iterates over `(map_key, value)` pairs).
+
+### 4.7 KeyedPriorityQueueState
+
+KeyedPriorityQueueState matches the Go SDK: the factory needs only `key_group` (no namespace), and the element codec must be ordered. Factory: `KeyedPriorityQueueStateFactory.from_context(ctx, store_name, key_group, item_codec)` or `from_context_auto_codec(ctx, store_name, key_group, item_type=None)`. Create a queue: `factory.new_keyed_priority_queue(primary_key, namespace)` (both primary_key and namespace are required, bytes), which yields a `KeyedPriorityQueueState[V]`. State methods: `add(value)`, `peek()` (returns `(min_element, found)`), `poll()` (removes and returns the minimum element), `clear()` (deletes everything by prefix), `all()` (iterates over all elements in order).
+
+### 4.8 KeyedReducingState
+
+KeyedReducingState matches the Go SDK: the factory needs only `key_group` (no namespace). Factory: `KeyedReducingStateFactory.from_context(ctx, store_name, key_group, value_codec, reduce_func)` or `from_context_auto_codec(ctx, store_name, key_group, reduce_func, value_type=None)`. Create state: `factory.new_reducing_state(primary_key, namespace)` (both required, bytes), which yields a `KeyedReducingState[V]`. State methods: `add(value)` (merges with the current value via reduce_func, then writes the result), `get()` (returns `(value, found)`), `clear()`.
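
The `add` / `get` / `clear` contract described for KeyedReducingState can be sketched with an in-memory stand-in. `InMemoryReducingState` below is a hypothetical class written for illustration only; it is not part of `fs_api_advanced`, and storing the first value unchanged is an assumption about behavior the diff does not spell out.

```python
from typing import Callable, Dict, Generic, Optional, Tuple, TypeVar

V = TypeVar("V")


class InMemoryReducingState(Generic[V]):
    """Illustrative stand-in for one (primary_key, namespace) reducing state."""

    def __init__(
        self,
        backing: Dict[Tuple[bytes, bytes], V],
        primary_key: bytes,
        namespace: bytes,
        reduce_func: Callable[[V, V], V],
    ) -> None:
        self._backing = backing
        self._slot = (primary_key, namespace)
        self._reduce = reduce_func

    def add(self, value: V) -> None:
        # Merge with the current value via reduce_func; the first value is stored as-is.
        if self._slot in self._backing:
            value = self._reduce(self._backing[self._slot], value)
        self._backing[self._slot] = value

    def get(self) -> Tuple[Optional[V], bool]:
        # Return (value, found) so a stored falsy value differs from "absent".
        if self._slot in self._backing:
            return self._backing[self._slot], True
        return None, False

    def clear(self) -> None:
        self._backing.pop(self._slot, None)


backing = {}
totals = InMemoryReducingState(backing, b"user-1", b"", lambda a, b: a + b)
totals.add(0)
print(totals.get())  # (0, True)
totals.add(5)
totals.add(7)
print(totals.get())  # (12, True)
totals.clear()
print(totals.get())  # (None, False)
```

The `(value, found)` pair is what lets a caller tell a stored `0` apart from a missing entry, which a bare `Optional` return combined with a truthiness check would not.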

 ---

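Sections 4.6 and 4.7 require an ordered codec. Assuming "ordered" means that the byte-wise order of encoded values matches the logical order of the values (the diff does not define the term), the sketch below shows why the encoding matters. `encode_be` and `encode_le` are hypothetical helpers, not SDK codecs.

```python
import struct


def encode_be(n: int) -> bytes:
    """Big-endian unsigned 64-bit: byte order matches numeric order for n >= 0."""
    return struct.pack(">Q", n)


def encode_le(n: int) -> bytes:
    """Little-endian unsigned 64-bit: byte order does not match numeric order."""
    return struct.pack("<Q", n)


values = [256, 1, 255, 2]

# Sorting by encoded bytes mimics a store that iterates keys in byte order.
sorted_by_be = sorted(values, key=encode_be)
sorted_by_le = sorted(values, key=encode_le)

print(sorted_by_be)  # [1, 2, 255, 256]
print(sorted_by_le)  # [256, 1, 2, 255]
```

With the little-endian encoding, a byte-ordered scan would surface 256 before 1, so a `peek()` built on such a scan would not return the minimum element.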
@@ -127,19 +147,19 @@ class CounterProcessor(FSProcessorDriver):

 ```python
 from fs_api import FSProcessorDriver, Context
-from fs_api_advanced import KeyedValueStateFactory
+from fs_api_advanced import KeyedValueState, KeyedValueStateFactory

 class KeyedCounterProcessor(FSProcessorDriver):
     def init(self, ctx: Context, config: dict):
         self._factory = KeyedValueStateFactory.from_context_auto_codec(
-            ctx, "counters", b"", b"by_key", value_type=int
+            ctx, "counters", b"by_key", value_type=int
         )

     def process(self, ctx: Context, source_id: int, data: bytes):
         primary_key = data[:8]
-        state = self._factory.new_keyed_value(primary_key, "count")
-        cur = state.value()
-        if cur is None:
+        state = KeyedValueState(self._factory, primary_key, "count".encode("utf-8"))
+        cur, found = state.value()
+        if not found:
             cur = 0
         state.update(cur + 1)
         ctx.emit(str(cur + 1).encode(), 0)
