@erankor erankor commented Mar 7, 2025

Previously, missing keys with children were ignored in the MapStruct marshal. This leads to missing values, and the code eventually panics in:
chunkMap[name] = layout.PagesToChunk(pages)
because pages is an empty slice in this case, and PagesToChunk assumes it has at least one element.
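
To illustrate the failure mode, here is a minimal sketch of a function that assumes a non-empty slice. The `page` type and the `chunkFromPages` and `tryChunk` functions are hypothetical stand-ins for illustration only; they are not the library's `PagesToChunk` implementation.

```go
package main

import "fmt"

// page is a hypothetical stand-in for the library's page type.
type page struct{ numValues int }

// chunkFromPages mimics a function that assumes len(pages) >= 1.
// It reads pages[0] unconditionally, so an empty slice causes a
// run-time "index out of range" panic.
func chunkFromPages(pages []*page) int {
	total := pages[0].numValues
	for _, p := range pages[1:] {
		total += p.numValues
	}
	return total
}

// tryChunk wraps chunkFromPages and converts a panic into an error,
// which makes the failure observable without crashing the program.
func tryChunk(pages []*page) (n int, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return chunkFromPages(pages), nil
}

func main() {
	// A column that never received a value ends up with no pages.
	_, err := tryChunk(nil)
	fmt.Println("empty pages:", err)

	n, err := tryChunk([]*page{{numValues: 3}, {numValues: 4}})
	fmt.Println("two pages:", n, err)
}
```

A column that never receives a value, not even a null, is presumably how the empty slice arises here, which is why the fix targets the marshal step rather than the chunk-building call.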

The fix is to push all the missing keys in the map onto the stack:

  • simple fields (without children) get a null value (as was done previously)
  • nested fields (with children) get an empty map value
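
The rule above can be sketched roughly as follows. This is a simplified illustration and not the code in this PR: `schemaNode` and `fillMissing` are hypothetical names, and recursion stands in for the explicit stack used by the marshaller. It assumes that an empty map pushed for a nested field is later processed like any other map, so its own leaf fields end up as nulls.

```go
package main

import "fmt"

// schemaNode is a hypothetical stand-in for a node in the schema tree.
// A node without children is a simple (leaf) field.
type schemaNode struct {
	Name     string
	Children []*schemaNode
}

// fillMissing returns a map that has an entry for every child of node:
//   - a missing simple field gets a nil (null) value
//   - a missing nested field gets an empty map, which is then filled
//     in the same way so that its leaves are present as nulls
func fillMissing(node *schemaNode, m map[string]any) map[string]any {
	out := make(map[string]any, len(node.Children))
	for _, child := range node.Children {
		v, ok := m[child.Name]
		if len(child.Children) == 0 {
			if ok {
				out[child.Name] = v
			} else {
				out[child.Name] = nil // simple field: null
			}
			continue
		}
		// Nested field: fall back to an empty map when the key is
		// missing, nil, or not a map.
		sub, _ := v.(map[string]any)
		if sub == nil {
			sub = map[string]any{}
		}
		out[child.Name] = fillMissing(child, sub)
	}
	return out
}

func main() {
	schema := &schemaNode{Name: "Object1", Children: []*schemaNode{
		{Name: "ID"},
		{Name: "Nested", Children: []*schemaNode{{Name: "ID"}}},
	}}
	// "Nested" is absent from the input, as in the reproduction case.
	fmt.Println(fillMissing(schema, map[string]any{"ID": "hello"}))
}
```

With every leaf present in the result, each column receives at least one value, so no column is left with an empty page list.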

Sample code that reproduces the issue:


type Object1 struct {
	ID     string   `parquet:"name=id, type=BYTE_ARRAY, convertedtype=UTF8"`
	Nested *Object2 `parquet:"name=nested"`
}

type Object2 struct {
	ID string `parquet:"name=id, type=BYTE_ARRAY, convertedtype=UTF8"`
}

func TestParquetNestingWithMap(t *testing.T) {
	r := require.New(t)

	buf := &bytes.Buffer{}
	pw, err := writer.NewParquetWriter(&parquetFileWriter{buf}, new(Object1), 2)
	r.NoError(err)

	item := map[string]any{
		"ID": "hello",
		// "Nested" is deliberately omitted; the missing key is what triggered the panic
	}
	err = pw.Write(item)
	r.NoError(err)

	err = pw.WriteStop()
	r.NoError(err)
}
