Structured cbor #3036

Open
wants to merge 25 commits into
base: dev
14 changes: 14 additions & 0 deletions benchmark/src/jmh/kotlin/kotlinx/benchmarks/cbor/CborBaseLine.kt
@@ -52,11 +52,25 @@ open class CborBaseline {
}

val baseBytes = cbor.encodeToByteArray(KTestOuterMessage.serializer(), baseMessage)
val baseStruct = cbor.encodeToCborElement(KTestOuterMessage.serializer(), baseMessage)

@Benchmark
fun toBytes() = cbor.encodeToByteArray(KTestOuterMessage.serializer(), baseMessage)

@Benchmark
fun fromBytes() = cbor.decodeFromByteArray(KTestOuterMessage.serializer(), baseBytes)


@Benchmark
fun structToBytes() = cbor.encodeToByteArray(CborElement.serializer(), baseStruct)

@Benchmark
fun structFromBytes() = cbor.decodeFromByteArray(CborElement.serializer(), baseBytes)

@Benchmark
fun fromStruct() = cbor.decodeFromCborElement(KTestOuterMessage.serializer(), baseStruct)

@Benchmark
fun toStruct() = cbor.encodeToCborElement(KTestOuterMessage.serializer(), baseMessage)

}
133 changes: 132 additions & 1 deletion docs/formats.md
@@ -17,6 +17,11 @@ stable, these are currently experimental features of Kotlin Serialization.
* [Tags and Labels](#tags-and-labels)
* [Arrays](#arrays)
* [Custom CBOR-specific Serializers](#custom-cbor-specific-serializers)
* [CBOR Elements](#cbor-elements)
* [Encoding from/to `CborElement`](#encoding-fromto-cborelement)
* [Tagging `CborElement`s](#tagging-cborelements)
* [Caution](#caution)
* [Types of CBOR Elements](#types-of-cbor-elements)
* [ProtoBuf (experimental)](#protobuf-experimental)
* [Field numbers](#field-numbers)
* [Integer types](#integer-types)
@@ -308,13 +313,125 @@ When annotated with `@CborArray`, serialization of the same object will produce
```
This may be used to encode COSE structures, see [RFC 9052 2. Basic COSE Structure](https://www.rfc-editor.org/rfc/rfc9052#section-2).


### Custom CBOR-specific Serializers
Cbor encoders and decoders implement the interfaces [CborEncoder](CborEncoder.kt) and [CborDecoder](CborDecoder.kt), respectively.
These interfaces contain a single property, `cbor`, exposing the current CBOR serialization configuration.
This enables custom cbor-specific serializers to reuse the current `Cbor` instance to produce embedded byte arrays or
react to configuration settings such as `preferCborLabelsOverNames` or `useDefiniteLengthEncoding`, for example.
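
As a minimal sketch of that idea, the serializer below looks for a `CborEncoder`/`CborDecoder` and reuses its `cbor` instance to write a value as an embedded byte array. `Payload`, `EmbeddedPayloadSerializer`, and `roundTrip` are hypothetical names introduced for illustration only, and the fallback to the default `Cbor` instance for non-CBOR formats is an assumption of this sketch, not a library recommendation.

```kotlin
import kotlinx.serialization.*
import kotlinx.serialization.builtins.ByteArraySerializer
import kotlinx.serialization.cbor.*
import kotlinx.serialization.descriptors.SerialDescriptor
import kotlinx.serialization.encoding.*

// Hypothetical payload type, used only for this illustration.
@Serializable
data class Payload(val id: Int)

// Hypothetical serializer that writes Payload as an embedded, CBOR-encoded byte array.
@OptIn(ExperimentalSerializationApi::class)
object EmbeddedPayloadSerializer : KSerializer<Payload> {
    private val bytes = ByteArraySerializer()

    override val descriptor: SerialDescriptor =
        SerialDescriptor("EmbeddedPayload", bytes.descriptor)

    override fun serialize(encoder: Encoder, value: Payload) {
        // Reuse the configured instance under CBOR; fall back to the default one otherwise (assumption).
        val cbor = (encoder as? CborEncoder)?.cbor ?: Cbor
        encoder.encodeSerializableValue(bytes, cbor.encodeToByteArray(Payload.serializer(), value))
    }

    override fun deserialize(decoder: Decoder): Payload {
        val cbor = (decoder as? CborDecoder)?.cbor ?: Cbor
        return cbor.decodeFromByteArray(Payload.serializer(), decoder.decodeSerializableValue(bytes))
    }
}

// Encodes and decodes a value through the custom serializer.
@OptIn(ExperimentalSerializationApi::class)
fun roundTrip(value: Payload): Payload {
    val encoded = Cbor.encodeToByteArray(EmbeddedPayloadSerializer, value)
    return Cbor.decodeFromByteArray(EmbeddedPayloadSerializer, encoded)
}
```

Because the inner value is encoded with the same `Cbor` instance that drives the outer serialization, settings such as `useDefiniteLengthEncoding` apply to the embedded bytes as well.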


### CBOR Elements

Aside from direct conversions between byte arrays and CBOR objects, Kotlin serialization offers APIs that allow
other ways of working with CBOR in code. For example, you might need to tweak the data before it can be parsed,
or work with data so unstructured that it does not readily fit into the typesafe world of Kotlin
serialization.

The main concept in this part of the library is [CborElement]. Read on to learn what you can do with it.

#### Encoding from/to `CborElement`

Bytes can be decoded into an instance of `CborElement` with the [Cbor.decodeFromByteArray] function, either by manually
specifying [CborElement.serializer()] or by specifying [CborElement] as the generic type parameter.
It is also possible to encode arbitrary serializable structures to a `CborElement` through [Cbor.encodeToCborElement].

Since these operations use the same code paths as regular serialization (but with specialized serializers), the config flags
behave as expected:

```kotlin
fun main() {
    val element: CborElement = Cbor.decodeFromHexString("a165627974657343666f6f")
    println(element)
}
```

The above snippet will print the following diagnostic notation:

```text
CborMap(tags=[], content={CborString(tags=[], value=bytes)=CborByteString(tags=[], value=h'666f6f)})
```
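
The opposite direction can be sketched in the same way. The example below is a hedged illustration that reuses the call shapes from this PR's benchmark (`encodeToCborElement`, `decodeFromCborElement`, and `CborElement.serializer()`); `Project` and `viaElement` are hypothetical names, and the exact signatures are assumed to match the benchmark code.

```kotlin
import kotlinx.serialization.*
import kotlinx.serialization.cbor.*

// Hypothetical class, used only for this illustration.
@Serializable
data class Project(val name: String, val stars: Int)

// Object -> CborElement -> bytes -> CborElement -> object.
@OptIn(ExperimentalSerializationApi::class)
fun viaElement(project: Project): Project {
    // Structured, in-memory representation of the serialized object
    val element: CborElement = Cbor.encodeToCborElement(Project.serializer(), project)
    // The element itself can be written to and read from bytes
    val bytes = Cbor.encodeToByteArray(CborElement.serializer(), element)
    val decoded = Cbor.decodeFromByteArray(CborElement.serializer(), bytes)
    // ...and converted back into the typed object
    return Cbor.decodeFromCborElement(Project.serializer(), decoded)
}

fun main() {
    val project = Project("kotlinx.serialization", 9000)
    println(viaElement(project) == project)
}
```

Passing through `CborElement` gives a place to inspect or rewrite the structure before it is turned into bytes or into a typed object.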

#### Tagging `CborElement`s

Every `CborElement` supports tags, whether it is used as a property, a value inside a collection, or even a complex key
inside a map (which is perfectly legal in CBOR). Tags can be specified by passing them as varargs parameters upon
`CborElement` creation.
For example, take the following structure (shown as an annotated hex dump):

<!--- TEST -->

```hexdump
bf # map(*)
61 # text(1)
61 # "a"
cc # tag(12)
1a 0fffffff # unsigned(268,435,455)
d8 22 # base64 encoded text, tag(34)
61 # text(1)
62 # "b"
# invalid length at 0 for base64
20 # negative(-1)
d8 38 # tag(56)
61 # text(1)
63 # "c"
d8 4e # typed array of i32, little endian, twos-complement, tag(78)
42 # bytes(2)
cafe # "\xca\xfe"
# invalid data length for typed array
61 # text(1)
64 # "d"
d8 5a # tag(90)
cc # tag(12)
6b # text(11)
48656c6c6f20576f726c64 # "Hello World"
ff # break
```

Decoding it results in the following CborElement (shown in manually formatted diagnostic notation):

```
CborMap(tags=[], content={
CborString(tags=[], value=a) = CborPositiveInt( tags=[12], value=268435455),
CborString(tags=[34], value=b) = CborNegativeInt( tags=[], value=-1),
CborString(tags=[56], value=c) = CborByteString( tags=[78], value=h'cafe),
CborString(tags=[], value=d) = CborString( tags=[90, 12], value=Hello World)
})
```
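
The `tag(...)`, `text(...)`, and `map(*)` annotations in the dump above follow from the layout of CBOR's initial byte (RFC 8949, section 3.1): the upper three bits select the major type and the lower five bits carry the additional information. The following library-independent sketch reproduces a few of those annotations; `splitInitialByte` and `describeHead` are hypothetical helpers, and arguments longer than one byte are deliberately left out.

```kotlin
// Major types 0..7 as defined by RFC 8949, section 3.1
val majorTypeNames = listOf(
    "unsigned", "negative", "bytes", "text", "array", "map", "tag", "simple/float"
)

/** Splits a CBOR initial byte into (major type, additional information). */
fun splitInitialByte(b: Int): Pair<Int, Int> = (b ushr 5) to (b and 0x1f)

/**
 * Describes the head of a data item: the initial byte plus, for additional
 * information 24, the one-byte argument that follows it.
 * Longer arguments (additional information 25..27) are not handled in this sketch.
 */
fun describeHead(bytes: IntArray): String {
    val (major, info) = splitInitialByte(bytes[0])
    val name = majorTypeNames[major]
    return when {
        info < 24 -> "$name($info)"          // argument stored in the initial byte itself
        info == 24 -> "$name(${bytes[1]})"   // argument stored in the next byte
        info == 31 -> "$name(*)"             // indefinite length
        else -> "$name(?)"                   // not covered here
    }
}

fun main() {
    println(describeHead(intArrayOf(0xbf)))        // map(*)
    println(describeHead(intArrayOf(0xcc)))        // tag(12)
    println(describeHead(intArrayOf(0xd8, 0x22)))  // tag(34)
    println(describeHead(intArrayOf(0x61)))        // text(1)
    println(describeHead(intArrayOf(0x42)))        // bytes(2)
}
```

A tag head is always followed by the data item it applies to, which is why the tags end up on the decoded element that comes right after them.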

##### Caution

Tags are properties of `CborElement`s, and it is possible to mix arbitrary serializable values with `CborElement`s that
contain tags inside a serializable structure. It is also possible to annotate any [CborElement] property
of a generic serializable class with `@ValueTags`.
**This can lead to asymmetric behavior when serializing and deserializing such structures!**

#### Types of CBOR Elements

The [CborElement] class has three direct subtypes, closely following CBOR grammar:

* [CborPrimitive] represents primitive CBOR elements, such as string, integer, float, boolean, and null.
  CBOR byte strings are also treated as primitives.
  Each primitive has a [value][CborPrimitive.value]. Depending on the concrete type of the primitive, it maps
  to corresponding Kotlin types such as `String`, `Int`, `Double`, etc.
  Note that CBOR distinguishes between positive ("unsigned") and negative ("signed") integers!
  `CborPrimitive` is itself an umbrella type (a sealed class) for the following concrete primitives:
  * [CborNull] mapping to a Kotlin `null`
  * [CborBoolean] mapping to a Kotlin `Boolean`
  * [CborInt] which is an umbrella type (a sealed class) itself for the following concrete types
    (it is still possible to instantiate it as the `invoke` operator on its companion is overridden accordingly):
    * [CborPositiveInt] represents all `Long` numbers `≥0`
    * [CborNegativeInt] represents all `Long` numbers `<0`
  * [CborString] maps to a Kotlin `String`
  * [CborFloat] maps to Kotlin `Double`
  * [CborByteString] maps to a Kotlin `ByteArray` and is used to encode them as CBOR byte string (in contrast to a list
    of individual bytes)

* [CborList] represents a CBOR array. It is a Kotlin [List] of `CborElement` items.

* [CborMap] represents a CBOR map/object. It is a Kotlin [Map] from `CborElement` keys to `CborElement` values.
  This is typically the result of serializing an arbitrary
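
The split between [CborPositiveInt] and [CborNegativeInt] mirrors CBOR's major types 0 and 1 (RFC 8949, section 3.1): a non-negative value is stored as is, while a negative value `n` is stored as the unsigned argument `-1 - n`. The sketch below works through that arithmetic without using the library; `IntHead`, `headOf`, and `valueFrom` are hypothetical names chosen for illustration.

```kotlin
/** CBOR major type and unsigned argument for a signed integer. */
data class IntHead(val majorType: Int, val argument: ULong)

/** Maps a signed value to its major type (0 or 1) and unsigned argument. */
fun headOf(value: Long): IntHead =
    if (value >= 0) IntHead(0, value.toULong())   // major type 0: the value itself
    else IntHead(1, (-1 - value).toULong())       // major type 1: stores -1 - value

/** Reverses [headOf]; only valid while the result fits into a Long. */
fun valueFrom(head: IntHead): Long =
    if (head.majorType == 0) head.argument.toLong() else -1 - head.argument.toLong()

fun main() {
    println(headOf(268_435_455))  // IntHead(majorType=0, argument=268435455)
    println(headOf(-1))           // IntHead(majorType=1, argument=0)
    println(valueFrom(headOf(Long.MIN_VALUE)) == Long.MIN_VALUE)  // true
}
```

This is why `-1` appears as the single byte `20` in a hex dump: major type 1 with an argument of 0.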


## ProtoBuf (experimental)

[Protocol Buffers](https://developers.google.com/protocol-buffers) is a language-neutral binary format that normally
@@ -1673,5 +1790,19 @@ This chapter concludes [Kotlin Serialization Guide](serialization-guide.md).
[Cbor.decodeFromByteArray]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor/decode-from-byte-array.html
[CborBuilder.ignoreUnknownKeys]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-builder/ignore-unknown-keys.html
[ByteString]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-byte-string/index.html
[CborElement]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-element/index.html
[Cbor.encodeToCborElement]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/encode-to-cbor-element.html
[CborPrimitive]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-primitive/index.html
[CborPrimitive.value]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-primitive/value.html
[CborNull]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-null/index.html
[CborBoolean]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-boolean/index.html
[CborInt]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-int/index.html
[CborPositiveInt]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-positive-int/index.html
[CborNegativeInt]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-negative-int/index.html
[CborString]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-string/index.html
[CborFloat]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-float/index.html
[CborByteString]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-byte-string/index.html
[CborList]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-list/index.html
[CborMap]: https://kotlinlang.org/api/kotlinx.serialization/kotlinx-serialization-cbor/kotlinx.serialization.cbor/-cbor-map/index.html

<!--- END -->
5 changes: 5 additions & 0 deletions docs/serialization-guide.md
@@ -154,6 +154,11 @@ Once the project is set up, we can start serializing some classes.
* <a name='tags-and-labels'></a>[Tags and Labels](formats.md#tags-and-labels)
* <a name='arrays'></a>[Arrays](formats.md#arrays)
* <a name='custom-cbor-specific-serializers'></a>[Custom CBOR-specific Serializers](formats.md#custom-cbor-specific-serializers)
* <a name='cbor-elements'></a>[CBOR Elements](formats.md#cbor-elements)
* <a name='encoding-fromto-cborelement'></a>[Encoding from/to `CborElement`](formats.md#encoding-fromto-cborelement)
* <a name='tagging-cborelements'></a>[Tagging `CborElement`s](formats.md#tagging-cborelements)
* <a name='caution'></a>[Caution](formats.md#caution)
* <a name='types-of-cbor-elements'></a>[Types of CBOR Elements](formats.md#types-of-cbor-elements)
* <a name='protobuf-experimental'></a>[ProtoBuf (experimental)](formats.md#protobuf-experimental)
* <a name='field-numbers'></a>[Field numbers](formats.md#field-numbers)
* <a name='integer-types'></a>[Integer types](formats.md#integer-types)