XML Support #20

@lschmierer

Hi, I would like to get a discussion about XML support rolling and potentially contribute to its development.

Unfortunately, the serde approach does not map well onto the ways the FHIR XML and JSON models differ.
I have written down two proposals, each with its own trade-offs.
Maybe we can start a discussion on what XML support in HFS could look like.

Motivation

  • Add support for parsing and emitting FHIR XML alongside the existing JSON pipeline.
  • Preserve FHIR semantics for primitive extensions, choice elements, and resource type dispatch across both encodings.
  • Unblock downstream users that need to exchange XML bundles with systems that do not offer JSON.

Current State

  • Generated types derive FhirSerde, which emits JSON-specific Serialize/Deserialize impls.
  • Primitive metadata lives in underscore companions (e.g., value / _value), matching JSON but not XML.
  • No serializer context is threaded through the generated code, so per-format behavior cannot be toggled today.
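
To make the mismatch concrete, here is a minimal std-only sketch of how one primitive with element-level metadata is laid out in each encoding. `Element`, `to_json_members`, and `to_xml_element` are hypothetical names for illustration, not HFS types; extensions and string escaping are left out.

```rust
/// A FHIR primitive element: the value plus element-level metadata.
/// `id` stands in for everything the JSON underscore companion carries
/// (extensions are omitted to keep the sketch short).
struct Element {
    value: Option<String>,
    id: Option<String>,
}

/// JSON: the value and its metadata become two sibling properties.
/// No escaping is done; this is for illustration only.
fn to_json_members(name: &str, e: &Element) -> Vec<String> {
    let mut members = Vec::new();
    if let Some(v) = &e.value {
        members.push(format!("\"{name}\":\"{v}\""));
    }
    if let Some(id) = &e.id {
        members.push(format!("\"_{name}\":{{\"id\":\"{id}\"}}"));
    }
    members
}

/// XML: value and metadata live on one element, as attributes.
fn to_xml_element(name: &str, e: &Element) -> String {
    let mut out = format!("<{name}");
    if let Some(id) = &e.id {
        out.push_str(&format!(" id=\"{id}\""));
    }
    if let Some(v) = &e.value {
        out.push_str(&format!(" value=\"{v}\""));
    }
    out.push_str("/>");
    out
}
```

For a `birthDate` of `1970-01-01` with id `bd1`, the JSON side yields the two members `"birthDate":"1970-01-01"` and `"_birthDate":{"id":"bd1"}`, while the XML side yields the single element `<birthDate id="bd1" value="1970-01-01"/>`.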

XML Library Considerations

Both options require custom helios_xml wrapper functions. Unlike JSON (where serde_json is the de facto standard), there is no standard serde XML implementation. The semantics of quick-xml and serde-xml-rs differ significantly, making it impractical to support both interchangeably.

Therefore, regardless of which option is chosen, I would suggest:

  • Introducing a helios_xml module/crate with helios_xml::from_str(), helios_xml::to_string(), etc.
  • Picking one XML backend (likely quick-xml due to better performance and streaming support)
  • Encapsulating all XML-specific behavior behind this interface
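
A rough std-only sketch of what such a facade could look like. Everything here is an assumption for illustration: the `ToXml`/`FromXml` traits stand in for whatever bound the real crate would use (most likely serde traits), the private `backend` module stands in for quick-xml, and `Id` is a toy type.

```rust
/// Hypothetical facade: callers see only these names, never the backend.
pub mod helios_xml {
    /// One error type for the whole crate, so backend errors never leak.
    #[derive(Debug, PartialEq)]
    pub struct Error(String);

    impl Error {
        pub fn new(msg: &str) -> Self {
            Error(msg.to_string())
        }
    }

    /// Stand-ins for "can be written as / read from FHIR XML".
    pub trait ToXml {
        fn write_xml(&self, out: &mut String);
    }
    pub trait FromXml: Sized {
        fn read_xml(input: &str) -> Result<Self, Error>;
    }

    pub fn to_string<T: ToXml>(value: &T) -> Result<String, Error> {
        let mut out = String::new();
        backend::write_declaration(&mut out);
        value.write_xml(&mut out);
        Ok(out)
    }

    pub fn from_str<T: FromXml>(input: &str) -> Result<T, Error> {
        T::read_xml(backend::strip_declaration(input))
    }

    /// Private: the only place that would know about the chosen XML library.
    mod backend {
        const DECL: &str = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>";

        pub fn write_declaration(out: &mut String) {
            out.push_str(DECL);
        }
        pub fn strip_declaration(input: &str) -> &str {
            input.strip_prefix(DECL).unwrap_or(input)
        }
    }
}

/// Toy type wired into the facade (illustrative only, no escaping).
#[derive(Debug, PartialEq)]
struct Id(String);

impl helios_xml::ToXml for Id {
    fn write_xml(&self, out: &mut String) {
        out.push_str(&format!("<id value=\"{}\"/>", self.0));
    }
}

impl helios_xml::FromXml for Id {
    fn read_xml(input: &str) -> Result<Self, helios_xml::Error> {
        input
            .strip_prefix("<id value=\"")
            .and_then(|rest| rest.strip_suffix("\"/>"))
            .map(|v| Id(v.to_string()))
            .ok_or_else(|| helios_xml::Error::new("expected <id value=\"...\"/>"))
    }
}
```

The point of the shape is that swapping the backend later would only touch the private module; `helios_xml::to_string(&value)` and `helios_xml::from_str::<T>(input)` stay stable for callers.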

Option 1: Context-Driven Serialization (DeserializeSeed Approach)

Extend generated code to support format-aware serialization and deserialization using context types and DeserializeSeed to thread context through the tree.

use helios_fhir::r4::Patient;
use helios_fhir::serde::{SerializationContext, DeserializationContext};
use helios_xml;

let patient = Patient::default();

// Existing JSON behavior (default `Serialize`)
let json = serde_json::to_string(&patient)?;

// JSON with context
let ctx = SerializationContext::json(&patient);
let json = serde_json::to_string(&ctx)?;

// XML with context (using helios_xml wrapper)
let ctx = SerializationContext::xml(&patient);
let xml = helios_xml::to_string(&ctx)?;

// or rather
let xml = helios_xml::to_string(&patient)?; // where helios_xml::to_string wraps SerializationContext::xml(&patient) internally

// Deserialization with context (using helios_xml wrapper)
let patient: Patient = helios_xml::from_str(&xml)?;

Implementation Notes

  • Rebuild all serde impls to operate on SerializationContext<T> / DeserializationContext.
  • Implement DeserializeSeed for all types to enable context passing during deserialization
  • Context wraps values during serialization and provides seeds during deserialization.
  • Default Serialize/Deserialize impls continue to work for JSON without any context.

Context Design

struct DeserializationContext {
  format: Format,
  mode: DeserializationMode,  // Strict/Lax/Compatibility
  config: Config,             // User preferences
}
  • Enables strict/lax/compatibility modes and other custom behaviors (in the future)
  • See fhirbolt's implementation, which follows this model
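
The way a context travels down the tree can be sketched without serde. The `Seed` trait below is a std-only stand-in that only mirrors the shape of `serde::de::DeserializeSeed` (the seed is consumed and carries state into the call); the flat field list replaces a real deserializer, the context drops `Config`, and the date check is a deliberately crude assumption.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Format { Json, Xml }

#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode { Strict, Lax }

/// Cut-down version of the context struct above (no `Config`).
#[derive(Clone, Copy)]
struct DeserializationContext { format: Format, mode: Mode }

/// Std-only stand-in for `serde::de::DeserializeSeed`.
trait Seed<I> {
    type Value;
    fn deserialize(self, input: I) -> Result<Self::Value, String>;
}

#[derive(Debug, Default, PartialEq)]
struct Patient { id: Option<String>, birth_date: Option<String> }

struct DateSeed(DeserializationContext);

impl<'a> Seed<&'a str> for DateSeed {
    type Value = String;
    fn deserialize(self, input: &'a str) -> Result<String, String> {
        // Strict mode checks the rough YYYY-MM-DD shape; lax mode accepts anything.
        let b = input.as_bytes();
        let shaped = b.len() == 10 && b[4] == b'-' && b[7] == b'-';
        if self.0.mode == Mode::Strict && !shaped {
            return Err(format!("invalid date: {input}"));
        }
        Ok(input.to_string())
    }
}

struct PatientSeed(DeserializationContext);

impl<'a> Seed<&'a [(&'a str, &'a str)]> for PatientSeed {
    type Value = Patient;
    fn deserialize(self, fields: &'a [(&'a str, &'a str)]) -> Result<Patient, String> {
        let mut patient = Patient::default();
        for &(name, value) in fields {
            match name {
                "id" => patient.id = Some(value.to_string()),
                // The parent hands its context on to the child seed.
                "birthDate" => {
                    patient.birth_date = Some(DateSeed(self.0).deserialize(value)?)
                }
                // The underscore companion is only a known field in JSON
                // (its content is ignored in this sketch).
                "_birthDate" if self.0.format == Format::Json => {}
                other if self.0.mode == Mode::Strict => {
                    return Err(format!("unknown field: {other}"));
                }
                _ => {} // Lax mode skips unknown fields.
            }
        }
        Ok(patient)
    }
}
```

With a strict context, `PatientSeed(ctx).deserialize(&fields[..])` rejects an unknown field or a malformed date anywhere in the tree; with a lax context the same input is accepted. None of the types needs a global or thread-local setting, which is the property the seed approach buys.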

Pros

  • Proper solution using serde's intended mechanisms (DeserializeSeed)
  • Efficient: direct struct ↔ format with no semantic mismatch
  • Can evolve from minimal to rich context as needs grow
  • Preserves current JSON-oriented API as default (opt-in contexts)

Cons

  • Requires reworking all serde code to implement DeserializeSeed
  • More complex than Option 2 (custom serializer)
  • Direct usage with serde_json is less straightforward

Option 2: Custom XML Serializer/Deserializer Built on JSON Model

Provide dedicated serde::Serializer/Deserializer implementations that internally translate the existing JSON-oriented structs into XML, without changing generated serde impls.

use helios_fhir::r4::Patient;
use helios_xml;

let patient = Patient::default();

// JSON remains unchanged
let json = serde_json::to_string(&patient)?;

// XML without changing struct serde impls
let xml = helios_xml::to_string(&patient)?;        // custom Serializer intercepts JSON model
let parsed: Patient = helios_xml::from_str(&xml)?; // custom Deserializer produces JSON model

Implementation Notes:

  • Implement custom serde::Serializer and serde::Deserializer that understand the JSON model's conventions.
  • The custom serializer intercepts calls like serialize_struct, examines field names for patterns (value/_value, valueString/valueInteger), and emits proper XML.
  • The custom deserializer reads XML and reconstructs the JSON-style struct with underscore extension fields.
  • No changes to existing serde impls.
  • Encapsulate all XML logic in separate helios-xml crate.
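
The interception step can be sketched as a small piece of struct-level serializer state. This is a std-only illustration under stated assumptions: `JsonField`, `Pending`, and `XmlStructWriter` are hypothetical names, `field` plays the role of a `serialize_field`-style callback, companions are reduced to an `id`, and namespaces and escaping are ignored.

```rust
/// What the JSON-oriented impls would hand to a serializer, reduced to two cases.
enum JsonField {
    /// A primitive value, e.g. `"birthDate": "1970-01-01"`.
    Value(String),
    /// An underscore companion reduced to its id, e.g. `"_birthDate": {"id": "bd1"}`.
    Companion { id: String },
}

#[derive(Default)]
struct Pending {
    value: Option<String>,
    id: Option<String>,
}

/// The writer cannot emit `<name .../>` as soon as it sees `name`,
/// because `_name` may still arrive later, so it buffers per field.
#[derive(Default)]
struct XmlStructWriter {
    pending: Vec<(String, Pending)>, // kept in first-seen order
}

impl XmlStructWriter {
    /// Called once per field, in the order the JSON-oriented impl emits them.
    fn field(&mut self, name: &str, field: JsonField) {
        // `_birthDate` and `birthDate` land in the same slot.
        let key = name.strip_prefix('_').unwrap_or(name);
        let found = self.pending.iter().position(|(k, _)| k.as_str() == key);
        let idx = match found {
            Some(i) => i,
            None => {
                self.pending.push((key.to_string(), Pending::default()));
                self.pending.len() - 1
            }
        };
        let slot = &mut self.pending[idx].1;
        match field {
            JsonField::Value(v) => slot.value = Some(v),
            JsonField::Companion { id } => slot.id = Some(id),
        }
    }

    /// Called after the last field; only now can the XML be written.
    fn end(self, element: &str) -> String {
        let mut out = format!("<{element}>");
        for (name, p) in &self.pending {
            out.push_str(&format!("<{name}"));
            if let Some(id) = &p.id {
                out.push_str(&format!(" id=\"{id}\""));
            }
            if let Some(v) = &p.value {
                out.push_str(&format!(" value=\"{v}\""));
            }
            out.push_str("/>");
        }
        out.push_str(&format!("</{element}>"));
        out
    }
}
```

Feeding it `id`, `birthDate`, and `_birthDate` and then calling `end("Patient")` produces `<Patient><id value="p1"/><birthDate id="bd1" value="1970-01-01"/></Patient>`. The buffering is where the interpretation overhead of this option comes from.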

Pros:

  • Zero changes to generated structs or existing JSON code
  • Clean encapsulation: XML support is opt-in via separate crate
  • No breaking changes to current API
  • Few or no changes to the existing macro-based Serialize/Deserialize implementations

Cons:

  • Performance penalty: semantic mismatch between JSON model and XML requires interpretation overhead
    • Effectively a round trip: value (struct) -> _value (JSON model) -> value (XML)
  • Less extensible: options such as strict/lax mode cannot be set at runtime

Discussion

  • I think Option 1, although it is the most work, would be the most future-proof design
    • Are different levels of strictness and validation (and potential further customizations) something you would like to support at some point?
    • JSON support can likely be retained in a non-breaking manner
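use serde::de::DeserializeSeed;
use serde::{Deserialize, Deserializer, Serialize, Serializer};

// Non-context API: delegate to the context-aware path with JSON as the default
impl Serialize for Patient {
    fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
    where
        S: Serializer,
    {
        // Delegate to context-aware version
        SerializationContext::json(self).serialize(serializer)
    }
}

impl<'de> Deserialize<'de> for Patient {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: Deserializer<'de>,
    {
        // Delegate to seed-based version with JSON context
        // (assumes the context is parameterized by the target type and implements DeserializeSeed)
        DeserializationContext::<Patient>::json().deserialize(deserializer)
    }
}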
  • On the other hand, Option 2 is much easier to implement, and its mental model is easier to understand

I would love to hear your thoughts on this!
