Apache Iceberg Rust version
None
Describe the bug
When a table's struct (or nested list/map) column has gained fields through schema evolution, reading data files written under the older schema fails with an Arrow cast error such as Cast error: Casting from Utf8 to Struct(...). The record-batch transformer matches a file's nested children to the table schema by position within the struct rather than by Iceberg field id. Once a nested struct gains a field, the children no longer line up, and a mismatched cast is attempted (e.g. casting a string child into a struct slot). The files themselves are valid and can be read by Iceberg-Java/Spark.
For example, if a struct evolves from (a, c) to (a, b, c), reading an old file that contains only a and c tries to cast c to the type of b.
To Reproduce
- Create a table with a column s struct<a: string> (plus other columns).
- Write a data file.
- Evolve the schema to s struct<a: string, b: long> (add a nested field), keeping field ids stable.
- Read the table (the older file still has s with only a).
- The scan errors with Cast error: Casting from Utf8 to Struct(...).
Expected behavior
Nested struct/list/map children are reconciled to the table schema by field id (recursively), and fields present in the table schema but absent from the file are materialized as typed NULLs, matching Iceberg's column-projection-by-id semantics. The read should succeed.
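To make the difference concrete, here is a minimal, dependency-free sketch of id-based reconciliation next to the positional matching described above. It is not the iceberg-rust or Arrow API: Type, Field, Value, reconcile, reconcile_by_position and demo_inputs are hypothetical names, and the field ids and the struct type chosen for b are assumptions picked only to mirror the (a, c) to (a, b, c) example.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for Iceberg types and file values.
// None of these names come from iceberg-rust or arrow-rs.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Type {
    String,
    Long,
    Struct(Vec<Field>),
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct Field {
    id: i32,
    name: &'static str,
    ty: Type,
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Null(Type),                // typed NULL
    String(String),
    Long(i64),
    Struct(Vec<(i32, Value)>), // children tagged with their field id
}

// Reshape a file value to the table type, matching struct children by field id.
// Children missing from the file become typed NULLs; children whose id is not
// in the table schema are dropped.
fn reconcile(target: &Type, file: Option<Value>) -> Result<Value, String> {
    match (target, file) {
        (_, None) | (_, Some(Value::Null(_))) => Ok(Value::Null(target.clone())),
        (Type::Struct(fields), Some(Value::Struct(children))) => {
            let mut by_id: HashMap<i32, Value> = children.into_iter().collect();
            let mut out = Vec::with_capacity(fields.len());
            for field in fields {
                out.push((field.id, reconcile(&field.ty, by_id.remove(&field.id))?));
            }
            Ok(Value::Struct(out))
        }
        (Type::String, Some(v @ Value::String(_))) => Ok(v),
        (Type::Long, Some(v @ Value::Long(_))) => Ok(v),
        (t, Some(v)) => Err(format!("cannot cast {:?} to {:?}", v, t)),
    }
}

// The positional matching described in this report: the i-th file child is
// paired with the i-th table field, regardless of field id.
fn reconcile_by_position(fields: &[Field], children: Vec<(i32, Value)>) -> Result<Value, String> {
    let mut remaining = children.into_iter();
    let mut out = Vec::with_capacity(fields.len());
    for field in fields {
        let child = remaining.next().map(|(_, value)| value);
        out.push((field.id, reconcile(&field.ty, child)?));
    }
    Ok(Value::Struct(out))
}

// Table schema after evolution: struct<a: string, b: struct<x: long>, c: string>.
// Old file: only a (id 1) and c (id 2). Ids and b's type are assumptions.
fn demo_inputs() -> (Type, Value) {
    let table = Type::Struct(vec![
        Field { id: 1, name: "a", ty: Type::String },
        Field {
            id: 3,
            name: "b",
            ty: Type::Struct(vec![Field { id: 4, name: "x", ty: Type::Long }]),
        },
        Field { id: 2, name: "c", ty: Type::String },
    ]);
    let file = Value::Struct(vec![
        (1, Value::String("x".to_string())),
        (2, Value::String("y".to_string())),
    ]);
    (table, file)
}
```

With these inputs, reconcile yields a, a typed NULL for b, and c, while reconcile_by_position pairs the string c with the struct slot b and returns an error, which is the same shape of failure as the Utf8-to-Struct cast error. A real fix would do the equivalent lookup on Arrow arrays using the field ids stored in the file's schema metadata.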
Willingness to contribute
I can contribute a fix for this bug independently