| 1 | +<!-- |
| 2 | +
| 3 | + Licensed to the Apache Software Foundation (ASF) under one |
| 4 | + or more contributor license agreements. See the NOTICE file |
| 5 | + distributed with this work for additional information |
| 6 | + regarding copyright ownership. The ASF licenses this file |
| 7 | + to you under the Apache License, Version 2.0 (the |
| 8 | + "License"); you may not use this file except in compliance |
| 9 | + with the License. You may obtain a copy of the License at |
| 10 | +
| 11 | + http://www.apache.org/licenses/LICENSE-2.0 |
| 12 | +
| 13 | + Unless required by applicable law or agreed to in writing, |
| 14 | + software distributed under the License is distributed on an |
| 15 | + "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| 16 | + KIND, either express or implied. See the License for the |
| 17 | + specific language governing permissions and limitations |
| 18 | + under the License. |
| 19 | +
| 20 | +--> |
| 21 | + |
| 22 | +# Python SDK — Advanced State API |
| 23 | + |
| 24 | +This document describes the **high-level state API** for the Python SDK: typed state abstractions (ValueState, ListState, MapState, etc.) built on top of the low-level KvStore, with serialization via **codecs** and optional **keyed state** per primary key. The design aligns with the [Go SDK Advanced State API](../Go-SDK/go-sdk-guide.md#7-advanced-state-api). |
| 25 | + |
| 26 | +--- |
| 27 | + |
| 28 | +## 1. Overview |
| 29 | + |
| 30 | +Use the advanced state API when you need structured state (single value, list, map, priority queue, aggregation, reduction) without manual byte encoding or key layout. You can create state either from **Context** (e.g. `ctx.getOrCreateValueState(...)`) or via **type-level constructors** on the state class (recommended for clarity and reuse, same pattern as the Go SDK). |
| 31 | + |
| 32 | +--- |
| 33 | + |
| 34 | +## 2. Creating State: Two Ways |
| 35 | + |
| 36 | +### 2.1 From Context (getOrCreate\*) |
| 37 | + |
| 38 | +`Context` defines methods such as `getOrCreateValueState(store_name, codec)` and `getOrCreateValueStateAutoCodec(store_name)`, with matching methods for ListState, MapState, PriorityQueueState, AggregatingState, ReducingState, and all Keyed\* factories. The runtime implementation delegates to the type-level `from_context` / `from_context_auto_codec` methods described in 2.2. |
| 39 | + |
| 40 | +### 2.2 From the state type (recommended, same as Go SDK) |
| 41 | + |
| 42 | +Each state type and keyed factory provides: |
| 43 | + |
| 44 | +- **With codec:** `XxxState.from_context(ctx, store_name, codec, ...)`, where you pass the codec(s) yourself. |
| 45 | +- **AutoCodec:** `XxxState.from_context_auto_codec(ctx, store_name)`, optionally with a type hint. The SDK picks a default codec (e.g. `PickleCodec`, or an ordered codec for map keys and priority-queue elements, where ordering is required). |
| 46 | + |
| 47 | +State instances are lightweight: you may create them per message in `process` or cache them in the driver (e.g. in `init`). The same store name always resolves to the same underlying store. |
| 48 | + |
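The remark about ordered codecs can be illustrated with a small sketch. It assumes (this is not stated above) that the underlying store compares keys as raw bytes, so a codec used for map keys or priority-queue elements has to produce bytes that sort the same way as the values they encode. The `OrderedUInt64Codec` class and its `encode` / `decode` method names are hypothetical and are not taken from the SDK.

```python
import struct


class OrderedUInt64Codec:
    """Hypothetical codec (not part of fs_api): fixed-width big-endian integers."""

    def encode(self, value: int) -> bytes:
        # Big-endian and fixed width: bytewise order equals numeric order
        # for integers in the range 0 .. 2**64 - 1.
        return struct.pack(">Q", value)

    def decode(self, data: bytes) -> int:
        return struct.unpack(">Q", data)[0]


codec = OrderedUInt64Codec()
values = [300, 2, 65536, 17]

# Sorting the encoded bytes yields the same order as sorting the numbers.
by_bytes = [codec.decode(b) for b in sorted(codec.encode(v) for v in values)]
```

A general-purpose serializer such as pickle makes no such ordering promise, which is presumably why AutoCodec uses ordered codecs in those positions.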
| 49 | +--- |
| 50 | + |
| 51 | +## 3. Non-Keyed State — Constructor Summary |
| 52 | + |
| 53 | +| State | With codec | AutoCodec | |
| 54 | +|-------|------------|-----------| |
| 55 | +| ValueState | `ValueState.from_context(ctx, store_name, codec)` | `ValueState.from_context_auto_codec(ctx, store_name)` | |
| 56 | +| ListState | `ListState.from_context(ctx, store_name, codec)` | `ListState.from_context_auto_codec(ctx, store_name)` | |
| 57 | +| MapState | `MapState.from_context(ctx, store_name, key_codec, value_codec)` | `MapState.from_context_auto_key_codec(ctx, store_name, value_codec)` (key codec only; the value codec is still passed explicitly) | |
| 58 | +| PriorityQueueState | `PriorityQueueState.from_context(ctx, store_name, codec)` | `PriorityQueueState.from_context_auto_codec(ctx, store_name)` | |
| 59 | +| AggregatingState | `AggregatingState.from_context(ctx, store_name, acc_codec, agg_func)` | `AggregatingState.from_context_auto_codec(ctx, store_name, agg_func)` | |
| 60 | +| ReducingState | `ReducingState.from_context(ctx, store_name, value_codec, reduce_func)` | `ReducingState.from_context_auto_codec(ctx, store_name, reduce_func)` | |
| 61 | + |
| 62 | +All of the above can also be obtained via the corresponding `ctx.getOrCreate*` methods (e.g. `ctx.getOrCreateValueState(store_name, codec)`), which delegate to these constructors. |
| 63 | + |
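As a rough illustration of what a `reduce_func` does, the sketch below folds each added value into a single stored value. `ToyReducingState` is an in-memory stand-in written for this example only; its `add` and `get` method names are assumptions and do not describe the SDK's `ReducingState` interface or its storage behaviour.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class ToyReducingState(Generic[T]):
    """In-memory stand-in (not the SDK class) that keeps one reduced value."""

    def __init__(self, reduce_func: Callable[[T, T], T]) -> None:
        self._reduce = reduce_func
        self._current: Optional[T] = None
        self._has_value = False

    def add(self, value: T) -> None:
        # The first value is stored as-is; later values are combined
        # with the stored one through reduce_func.
        if self._has_value:
            self._current = self._reduce(self._current, value)
        else:
            self._current = value
            self._has_value = True

    def get(self) -> Optional[T]:
        return self._current


# Track the maximum of all values seen so far.
running_max = ToyReducingState(max)
for reading in [4, 9, 2]:
    running_max.add(reading)
```

Only one value is kept per state, so the stored size stays constant no matter how many values are added.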
| 64 | +--- |
| 65 | + |
| 66 | +## 4. Keyed State — Factories and keyGroup / key / namespace |
| 67 | + |
| 68 | +**Keyed state is for keyed operators.** When the stream is partitioned by a key (e.g. after keyBy), each key gets isolated state. You obtain a **factory** once (from context, store name, **namespace**, and **key_group**), then create state **per primary key** (the stream key for the current record). |
| 69 | + |
| 70 | +### 4.1 keyGroup, key (primaryKey), and namespace |
| 71 | + |
| 72 | +| Term | API parameter | Meaning | |
| 73 | +|------|----------------|---------| |
| 74 | +| **key_group** | `key_group` when creating the factory | The **keyed group**: identifies which keyed partition/group this state belongs to (e.g. one group for "counters", another for "sessions"). | |
| 75 | +| **key** | The argument to factory methods (e.g. `new_keyed_value(primary_key)`) | The **value of the stream key** for the current record (e.g. user ID, partition key). Each distinct key value gets isolated state. | |
| 76 | +| **namespace** | `namespace` (bytes) when creating the factory | **If a window function is present**, use the **window identifier as bytes**. **Without windows**, pass **empty bytes** (e.g. `b""`). | |
| 77 | + |
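The way the three terms isolate state can be sketched with an in-memory model. This is an illustration only: the `Toy*` classes are not SDK classes, the composite `(key_group, primary_key, namespace)` slot is an assumed layout rather than the SDK's real key encoding, and passing `key_group` as bytes is likewise an assumption.

```python
from typing import Any, Dict, Tuple

Slot = Tuple[bytes, bytes, bytes]  # (key_group, primary_key, namespace)


class ToyKeyedValue:
    """Handle for one slot of the shared in-memory store (illustration only)."""

    def __init__(self, backing: Dict[Slot, Any], slot: Slot) -> None:
        self._backing = backing
        self._slot = slot

    def update(self, value: Any) -> None:
        self._backing[self._slot] = value

    def value(self) -> Tuple[Any, bool]:
        # Return (value, found); found is False when nothing was stored.
        if self._slot in self._backing:
            return self._backing[self._slot], True
        return None, False


class ToyKeyedValueFactory:
    """Created once per (namespace, key_group); hands out state per primary key."""

    def __init__(self, backing: Dict[Slot, Any],
                 namespace: bytes, key_group: bytes) -> None:
        self._backing = backing
        self._namespace = namespace
        self._key_group = key_group

    def new_keyed_value(self, primary_key: bytes) -> ToyKeyedValue:
        slot = (self._key_group, primary_key, self._namespace)
        return ToyKeyedValue(self._backing, slot)


store: Dict[Slot, Any] = {}
counters = ToyKeyedValueFactory(store, namespace=b"", key_group=b"counters")
sessions = ToyKeyedValueFactory(store, namespace=b"", key_group=b"sessions")
windowed = ToyKeyedValueFactory(store, namespace=b"window-1", key_group=b"counters")

# The same primary key in different key groups or namespaces never collides.
counters.new_keyed_value(b"user-1").update(3)
sessions.new_keyed_value(b"user-1").update("open")
windowed.new_keyed_value(b"user-1").update(1)
```

In this model the factory fixes the namespace and key group once, and only the primary key changes from record to record.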
| 78 | +### 4.2 Factory constructor summary (keyed) |
| 79 | + |
| 80 | +| Factory | With codec | AutoCodec | |
| 81 | +|---------|------------|-----------| |
| 82 | +| KeyedValueStateFactory | `KeyedValueStateFactory.from_context(ctx, store_name, namespace, key_group, value_codec)` | `KeyedValueStateFactory.from_context_auto_codec(ctx, store_name, namespace, key_group, value_type=None)` | |
| 83 | +| KeyedListStateFactory | `KeyedListStateFactory.from_context(ctx, store_name, namespace, key_group, value_codec)` | `KeyedListStateFactory.from_context_auto_codec(ctx, store_name, namespace, key_group, value_type=None)` | |
| 84 | +| KeyedMapStateFactory | `KeyedMapStateFactory.from_context(ctx, store_name, namespace, key_group, key_codec, value_codec)` | `KeyedMapStateFactory.from_context_auto_codec(ctx, store_name, namespace, key_group, value_codec)` | |
| 85 | +| KeyedPriorityQueueStateFactory | `KeyedPriorityQueueStateFactory.from_context(ctx, store_name, namespace, key_group, item_codec)` | `KeyedPriorityQueueStateFactory.from_context_auto_codec(ctx, store_name, namespace, key_group, item_type=None)` | |
| 86 | +| KeyedAggregatingStateFactory | `KeyedAggregatingStateFactory.from_context(ctx, store_name, namespace, key_group, acc_codec, agg_func)` | `KeyedAggregatingStateFactory.from_context_auto_codec(ctx, store_name, namespace, key_group, agg_func, acc_type=None)` | |
| 87 | +| KeyedReducingStateFactory | `KeyedReducingStateFactory.from_context(ctx, store_name, namespace, key_group, value_codec, reduce_func)` | `KeyedReducingStateFactory.from_context_auto_codec(ctx, store_name, namespace, key_group, reduce_func, value_type=None)` | |
| 88 | + |
| 89 | +You can also use the corresponding `ctx.getOrCreateKeyed*Factory(...)` methods, which delegate to these constructors. |
| 90 | + |
| 91 | +--- |
| 92 | + |
| 93 | +## 5. Example: ValueState with from_context_auto_codec |
| 94 | + |
| 95 | +```python |
| 96 | +from fs_api import FSProcessorDriver, Context |
| 97 | +from fs_api.store import ValueState |
| 98 | + |
| 99 | +class CounterProcessor(FSProcessorDriver): |
| 100 | +    def process(self, ctx: Context, source_id: int, data: bytes): |
| 101 | +        state = ValueState.from_context_auto_codec(ctx, "my-store")  # per message, or cache in init |
| 102 | +        value, found = state.value() or (None, False) |
| 103 | +        count = (value if found else 0) + 1  # start from 0 when nothing is stored yet |
| 104 | +        state.update(count) |
| 105 | +        ctx.emit(str(count).encode(), 0) |
| 106 | +``` |
| 107 | + |
| 108 | +The same pattern applies to the other state types: use `XxxState.from_context(ctx, store_name, ...)` or `XxxState.from_context_auto_codec(ctx, store_name)`, as listed in the tables above. |
| 109 | + |
| 110 | +--- |
| 111 | + |
| 112 | +## 6. See also |
| 113 | + |
| 114 | +- [Python SDK Guide](python-sdk-guide.md) — main guide for fs_api, fs_client, and basic Context/KvStore usage. |
| 115 | +- [Go SDK Guide — Advanced State API](../Go-SDK/go-sdk-guide.md#7-advanced-state-api) — equivalent API in the Go SDK. |