Splitting nested columns into their own tables has issues with arrays #11

Description

@GeorgelPreput

The function _create_split_table has logic to handle array columns, but the approach is somewhat simplistic:

It uses posexplode to turn each array element into its own row, together with the element's position in the original array. This is a good start, but a position alone does not identify the parent row. Without a reliable way to link the exploded rows back to their parent data, the split table may be hard to join back and may not meet the needs of more complex analysis.

The generated key ({col_name}_key) links the exploded elements back to the original data. However, a single key may not be sufficient for deeply nested structures or arrays within arrays, where one key per nesting level might be necessary to preserve the full data hierarchy.
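To make the multi-level key concern concrete, here is a minimal sketch in plain Python rather than Spark. It does not reproduce _create_split_table; the function explode_nested and the field names parent_key, positions, and value are hypothetical, chosen only for illustration. It shows one possible shape for the result: each leaf element keeps the parent key plus its position at every array level, so the hierarchy can be rebuilt.

```python
from typing import Any, Iterator


def explode_nested(
    parent_key: str, values: list[Any], path: tuple[int, ...] = ()
) -> Iterator[dict[str, Any]]:
    """Yield one row per leaf element (hypothetical helper, not the project's API).

    Each row carries the parent key and the position at every nesting level,
    which is the information a single flat key would lose.
    """
    for pos, value in enumerate(values):
        current = path + (pos,)
        if isinstance(value, list):
            # Nested array: recurse and extend the position path by one level.
            yield from explode_nested(parent_key, value, current)
        else:
            yield {"parent_key": parent_key, "positions": current, "value": value}


# An array of arrays belonging to a single parent row.
rows = list(explode_nested("order-1", [["a", "b"], ["c"]]))
for row in rows:
    print(row)
```

With only the innermost position, "a" and "c" would both sit at position 0 and be indistinguishable. The full position path (0, 0) versus (1, 0) keeps them apart.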
