sebschu

@sebschu sebschu commented May 22, 2025

The "replace" transformation is currently not implemented. This PR implements it: users specify a string of the form pattern/replacement, and replacement is substituted for every match of pattern. In the background it runs re.sub, so match groups can be referenced in the replacement string.

Slashes can be escaped using a backslash, e.g., my\/path/your\/path to replace my/path with your/path.
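
For readers skimming the thread, here is a minimal sketch of the behaviour described above. It is not the code in this PR: the helper names `split_replace_spec` and `apply_replace` are hypothetical, and the sketch assumes the spec contains exactly one unescaped slash.

```python
import re

# Hypothetical helpers for illustration; these names are NOT taken from mlcroissant.
# A slash counts as a separator only when it is not preceded by a backslash.
_UNESCAPED_SLASH = re.compile(r"(?<!\\)/")


def split_replace_spec(spec: str) -> tuple[str, str]:
    """Split a "pattern/replacement" spec on its single unescaped slash."""
    parts = _UNESCAPED_SLASH.split(spec)
    if len(parts) != 2:
        raise ValueError(
            f"expected exactly one unescaped '/' in {spec!r}, found {len(parts) - 1}"
        )
    # Turn the escaped slashes (\/) back into plain slashes.
    pattern, replacement = (part.replace("\\/", "/") for part in parts)
    return pattern, replacement


def apply_replace(value: str, spec: str) -> str:
    """Apply re.sub to value using the pattern and replacement encoded in spec."""
    pattern, replacement = split_replace_spec(spec)
    return re.sub(pattern, replacement, value)


# Escaped slashes: replace my/path with your/path.
print(apply_replace("my/path/file.txt", r"my\/path/your\/path"))
# -> your/path/file.txt

# Match groups in the replacement string are handled by re.sub.
print(apply_replace("2025-05-22", r"(\d+)-(\d+)-(\d+)/\3.\2.\1"))
# -> 22.05.2025
```

One limitation of this sketch: the lookbehind treats any slash preceded by a backslash as escaped, so a pattern that ends in a literal backslash (`\\`) directly before the separator would be split incorrectly.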

Functional tests are implemented in python/mlcroissant/mlcroissant/_src/operation_graph/operations/field_test.py.

I followed the syntax of python/mlcroissant/mlcroissant/_src/structure_graph/nodes/source_test.py#L19 for the replacement string.

sebschu added 6 commits May 22, 2025 16:56
Implements correct functionality for the "replace" attribute in mlc.Transform. It takes a string of the form "pattern/replacement" and performs re.sub with `pattern` and `replacement`.

Slashes can be escaped using a backslash, e.g., "my\/path/your\/path".
@sebschu sebschu requested a review from a team as a code owner May 22, 2025 16:07

github-actions bot commented May 22, 2025

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

@sebschu
Author

sebschu commented May 22, 2025

I added my GitHub ID when signing up for the newsletter. Please let me know if there is anything else I have to do to sign the CLA.

@ccl-core ccl-core self-requested a review May 23, 2025 10:21
@ccl-core
Contributor

Hi @sebschu, thank you for your contribution!
To merge PRs, you will need to sign the MLCommons Association CLA at: https://mlcommons.org/community/subscribe/

@sebschu
Author

sebschu commented May 27, 2025

recheck

@ccl-core
Contributor

Could you please also add details on this transformation to the new 1.1 specs?
https://github.com/mlcommons/croissant/blob/main/docs/croissant-spec-draft.md#transform

Thank you!

@ccl-core
Contributor

Thank you! Looks great, and it seems that the CLA check passes now :)

@sebschu
Author

sebschu commented May 27, 2025

I fixed the two issues and added a brief description to the 1.1 specs draft.
