Support non-sequence meta-features in PyGrain packing transformations. #1334

Merged
copybara-service[bot] merged 1 commit into main from test_926448116 on Jun 17, 2026

@copybara-service copybara-service Bot commented Jun 4, 2026

Support non-sequence meta-features in PyGrain packing transformations.

Previously, only sequence meta-features (defined in `length_struct`) were supported. Non-sequence meta-features (in `meta_features` but not in `length_struct`) were stripped or caused errors during packing.
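
To make the distinction concrete, here is a minimal sketch of how the two kinds of meta-features could be told apart. It assumes flat (non-nested) feature keys; the function name `split_meta_feature_keys` and the feature names are hypothetical and are not taken from the PyGrain source.

```python
def split_meta_feature_keys(length_struct, meta_features):
    """Hypothetical helper: partition meta-feature keys by membership in length_struct."""
    # A meta-feature that has an entry in length_struct is a sequence meta-feature.
    sequence_keys = [k for k in meta_features if k in length_struct]
    # A meta-feature without an entry in length_struct is a non-sequence meta-feature.
    non_sequence_keys = [k for k in meta_features if k not in length_struct]
    return sequence_keys, non_sequence_keys


# Illustrative inputs (assumed names, flat structure).
length_struct = {"tokens": 8, "token_weights": 8}
meta_features = ["token_weights", "doc_id"]

seq_keys, non_seq_keys = split_meta_feature_keys(length_struct, meta_features)
```

Here `token_weights` has a packed length and counts as a sequence meta-feature, while `doc_id` has none and is the kind of feature that was previously stripped or rejected.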

This change adds support for non-sequence meta-features in both FirstFit and BestFit packing methods:

- Python implementation (`PackedBatch`): Non-sequence meta-features are identified, accumulated as lists during packing, and yielded as 1D NumPy object arrays of lists.
- Python iterator (`PackingDatasetIterator`): The `_combined_struct` is updated to include non-sequence meta-features so they are not stripped from input elements.
- Refactored common key-extraction logic into shared helper functions in `packing_packed_batch.py`.
- Added unit tests in `testing_util.py` to verify FirstFit and BestFit with non-sequence meta-features (both fixed and variable shapes).
- Updated docstrings in `packing.py` to document the behavior of sequence vs non-sequence meta-features.

@copybara-service copybara-service Bot force-pushed the test_926448116 branch 5 times, most recently from 0ee3e90 to c2f7bc8 June 17, 2026 02:23

PiperOrigin-RevId: 933436511
@copybara-service copybara-service Bot merged commit 4af068d into main Jun 17, 2026
@copybara-service copybara-service Bot deleted the test_926448116 branch June 17, 2026 02:48