`LXMessage.unpack`: check if `title` and `content` are byte arrays or strings by leroycep · Pull Request #26 · markqvist/LXMF

leroycep · 2025-02-26T20:20:27Z

If a the msgpacked portion of an LXMessage uses strings (fixstr tags 0xa0 to 0xbf, and probably str_8 (0xd9), str_16 (0xda), and str_32 (0xdb) as well) instead of byte arrays, LXMF will happily unpack them but then crash when trying to decode the string. For example:

0090: 94 CB 41 D9 EF DB 0D 00  00 00 AC 48 65 6C 6C 6F  ..A........Hello
00a0: 2C 20 6C 78 6D 66 21 A0  C0                       , lxmf!..

(Side note, is there a reason to LXMF uses byte arrays in the msgpack instead of strings? Like, does msgpack not specify UTF-8 as the string format or something?)

This PR changes unpack to have logic similar to LXMessage.__init__, where it will always decode it into UTF-8 byte array based on the type.

markqvist · 2025-02-27T11:44:53Z

Content and titles should never be strings, but always binary data in UTF-8 encoding. To support any language and character set, LXMF only supports UTF-8 for parts that are intended for human readability.

This should have been made more explicit, though, and I think the right way to go about this is to not accept messages where the byte arrays of title and content cannot be decoded as UTF-8. It should definitely not crash, of course ;)

As you can see in set_content_from_string() and set_title_from_string(), the passed string values are encoded to UTF-8, and then set on LXMessage instance. These helper methods exist for message creation. Over the wire, there should never be any string data types in those parts of the message-packed data structure.

leroycep · 2025-02-27T15:49:55Z

Quoting the msgpack spec:

  * **String** extending Raw type represents a UTF-8 string

It seems like msgpack defines the string type to mean a UTF-8 encoded byte array. That implies that the implementation should be switched to use only msgpack strings and not byte array, no?

check if the decoded messages are bytes or strings

1200dce

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

`LXMessage.unpack`: check if `title` and `content` are byte arrays or strings#26

`LXMessage.unpack`: check if `title` and `content` are byte arrays or strings#26
leroycep wants to merge 1 commit intomarkqvist:masterfrom
leroycep:fix-string-messages-crash

leroycep commented Feb 26, 2025

Uh oh!

markqvist commented Feb 27, 2025

Uh oh!

leroycep commented Feb 27, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

leroycep commented Feb 26, 2025

Uh oh!

markqvist commented Feb 27, 2025

Uh oh!

leroycep commented Feb 27, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants