Open
Describe the bug
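High memory use when writing compressed data pages: each compressed page buffer keeps the capacity of the uncompressed page.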
To Reproduce
Write a file with compressed pages using Parquet v1 data pages.
Expected behavior
Lower memory use: the memory held per page should be close to the compressed size, not the uncompressed size.
Additional context
The buffer is not shrunk after compression, and converting it to Bytes doesn't change the memory layout, so the unused capacity stays allocated.
arrow-rs/parquet/src/column/writer/mod.rs
Lines 1074 to 1076 in b9c2bf7
let mut compressed_buf = Vec::with_capacity(uncompressed_size);
cmpr.compress(&buffer[..], &mut compressed_buf)?;
buffer = compressed_buf;
In my case an uncompressed page is ~1M and the compressed page is ~20k, so a lot of memory is wasted.
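To illustrate the effect outside of arrow-rs, here is a minimal standalone sketch using only the standard library. `fake_compress`, `compress_keep_capacity` and `compress_and_shrink` are hypothetical names made up for this example, and the "codec" simply keeps every 50th byte to mimic the ~1M to ~20k ratio described above. Calling `shrink_to_fit` is shown as one possible mitigation, not necessarily the change that should go into the writer.

```rust
/// Hypothetical stand-in for a real codec: appends a "compressed" form of
/// `input` to `output`. It keeps every 50th byte to mimic a ~50x ratio.
fn fake_compress(input: &[u8], output: &mut Vec<u8>) {
    output.extend(input.iter().step_by(50));
}

/// Mirrors the pattern quoted above: the output buffer is pre-sized to the
/// uncompressed length and then kept as-is, so its capacity stays large.
fn compress_keep_capacity(page: &[u8]) -> Vec<u8> {
    let mut compressed = Vec::with_capacity(page.len());
    fake_compress(page, &mut compressed);
    compressed
}

/// Same as above, but asks the allocator to release the unused tail before
/// the buffer is retained. This may reallocate and copy the compressed bytes.
fn compress_and_shrink(page: &[u8]) -> Vec<u8> {
    let mut compressed = compress_keep_capacity(page);
    compressed.shrink_to_fit();
    compressed
}
```

With a 1,000,000-byte input, both functions return 20,000 bytes of data, but the first one still owns an allocation of at least 1,000,000 bytes. The trade-off of shrinking is an extra reallocation per page.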
Before change
(3,639,172,424B) 0x41C901F: parquet::column::writer::GenericColumnWriter<E>::add_data_page (mod.rs:1070)
(876,240,384B) 0x41D2F36: parquet::column::writer::GenericColumnWriter<E>::add_data_page (mod.rs:1070)
This is for an output file of ~550M.