Strip invalid char data from strings on save #329

hahn-kev · 2025-06-17T04:29:15Z

relates to #328

I didn't see anything stripping out data, so I've gone ahead and done that via a compiled regex.

github-actions · 2025-06-17T04:37:08Z

LCM Tests

16 files ±0, 16 suites ±0, 2m 52s ⏱️ -2s
2 846 tests +12: 2 826 ✅ +12, 20 💤 ±0, 0 ❌ ±0
11 332 runs +48: 11 164 ✅ +48, 168 💤 ±0, 0 ❌ ±0

Results for commit cbd23aa. ± Comparison against base commit 0eb28b5.

♻️ This comment has been updated with latest results.

jasonleenaylor

Reviewed 4 of 4 files at r2, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @hahn-kev)

hahn-kev · 2025-06-23T06:11:19Z

@jasonleenaylor I've added a link to a Wikipedia page documenting the valid XML characters. It mentions that U+10000–U+10FFFF is also valid, but we have not included that range here. Is that OK, or do we need to include it? I realized this change could strip valid data on save if we get it wrong, so it is worth checking more carefully that the regex is correct. Otherwise users could start losing data on save without noticing right away.

jasonleenaylor · 2025-07-24T19:59:45Z

Sorry that I didn't see your comment earlier. Yes, we should include that range. It is very likely that someone using FieldWorks will have a character in that range.
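
For reference, the range question above comes down to the XML 1.0 `Char` production, which allows tab, line feed, carriage return, U+0020–U+D7FF, U+E000–U+FFFD, and U+10000–U+10FFFF. The following is a minimal Python sketch of that check, not the code from this PR. The names `XML_1_0_CHAR_RANGES` and `is_valid_xml_char` are hypothetical and used only for illustration.

```python
# Illustrative sketch only; names are hypothetical and not taken from the PR.
# Inclusive code point ranges allowed by the XML 1.0 Char production.
XML_1_0_CHAR_RANGES = (
    (0x9, 0x9),           # tab
    (0xA, 0xA),           # line feed
    (0xD, 0xD),           # carriage return
    (0x20, 0xD7FF),       # most of the Basic Multilingual Plane
    (0xE000, 0xFFFD),     # BMP above the surrogate block, minus U+FFFE and U+FFFF
    (0x10000, 0x10FFFF),  # supplementary planes, the range discussed above
)


def is_valid_xml_char(ch: str) -> bool:
    """Return True if the single character ch may appear in XML 1.0 content."""
    code_point = ord(ch)
    return any(low <= code_point <= high for low, high in XML_1_0_CHAR_RANGES)
```

Under this check a supplementary-plane character such as U+10400 is valid, while a control character such as U+0000 is not, which is why leaving the last range out would cause valid data to be stripped.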

hahn-kev · 2025-08-07T07:01:45Z

@jasonleenaylor I worked with Martin H and rewrote the regex to match only the invalid characters that we want to remove. He also helped me come up with a test case that uses a character outside the normal range.
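
The approach described above, matching only the invalid characters and removing them, can be sketched roughly as follows. This is a Python illustration under stated assumptions, not the regex from this PR: the names `INVALID_XML_CHARS` and `strip_invalid_xml_chars` are hypothetical. Python strings are sequences of code points, so the supplementary range can be written directly in the character class. A UTF-16 based implementation would instead have to account for surrogate pairs, which is presumably what the surrogate pair commit in this PR addresses.

```python
import re

# Illustrative sketch only; names are hypothetical and not taken from the PR.
# The negated class matches any code point NOT allowed by the XML 1.0 Char
# production, so everything it matches can be removed safely.
INVALID_XML_CHARS = re.compile(
    r"[^\t\n\r\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text: str) -> str:
    """Remove characters that XML 1.0 does not allow in character data."""
    return INVALID_XML_CHARS.sub("", text)


# A control character and U+FFFE are stripped; a supplementary-plane
# character (U+10400) is kept.
cleaned = strip_invalid_xml_chars("a\x00b\ufffec\U00010400")
```

Compiling the pattern once at module level mirrors the "compiled regex" mentioned in the PR description and avoids rebuilding it on every save.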

jasonleenaylor

@jasonleenaylor reviewed 2 of 2 files at r4, all commit messages.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @hahn-kev)

write a test for xml serialization

60c6dc9

hahn-kev added 3 commits June 17, 2025 13:04

use regex to strip bad chars from strings before they get into xml

54e3b80

flush xml writer to ensure the memory stream isn't empty

099990e

use strip invalid xml chars for Unicode strings

3f891a1

hahn-kev marked this pull request as ready for review June 18, 2025 02:38

jasonleenaylor approved these changes Jun 19, 2025

add a link which documents the valid xml chars

9334b53

hahn-kev added 3 commits August 6, 2025 16:32

include characters in the range U+10000–U+10FFFF, and add a test ensuring they aren't stripped

07aa6ce

use the correct matcher for the surrogate pair code range

f44b8d4

switched regex to match on invalid chars

cbd23aa

Merge branch 'master' into xml-invalid-char-data

10f8846

jasonleenaylor approved these changes Aug 28, 2025

jasonleenaylor enabled auto-merge (squash) August 28, 2025 16:10

jasonleenaylor merged commit 7b63718 into master Aug 28, 2025
4 checks passed

jasonleenaylor deleted the xml-invalid-char-data branch August 28, 2025 16:11
