Auto backend Selection #195
Merged
…date_sklearn_pipeline # Conflicts: # melusine/pipeline.py
# Conflicts: # melusine/__init__.py # melusine/backend/active_backend.py # melusine/backend/base_backend.py # tests/backend/test_backends.py # tests/detectors/test_basic_detector.py # tests/processors/test_processors.py
Pull request overview
This PR refactors Melusine’s backend management to support multiple registered backends and automatic backend selection based on input data type, while updating tests and documentation to reflect the new debug-mode usage and adding a small email-segmentation enhancement.
Changes:
- Refactor `ActiveBackend` to maintain a prioritized backend list and auto-select a backend via `select_backend()` based on `supported_types`.
- Add `supported_types` to the backend interface and implement it for dict/pandas backends.
- Update tests/docs for backend behavior and debug-mode usage; extend segmentation keywords.
Reviewed changes
Copilot reviewed 21 out of 21 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| `melusine/backend/active_backend.py` | Introduces backend list management, backend auto-selection, and updates backend operations to route through selected backend. |
| `melusine/backend/base_backend.py` | Extends backend interface with `supported_types`. |
| `melusine/backend/dict_backend.py` | Declares dict support via `supported_types`. |
| `melusine/backend/pandas_backend.py` | Declares DataFrame support via `supported_types`. |
| `melusine/processors.py` | Adds segmentation keywords (“Copie”, “Attachments”). |
| `melusine/utils/show_versions.py` | Adjusts coverage pragmas in dependency version probing. |
| `melusine/__init__.py` | Exports `MelusinePipeline` at top-level and bumps version to 3.3.1. |
| `tests/conftest.py` | Replaces dict-backend fixture with a backend reset fixture. |
| `tests/backend/test_backends.py` | Adds coverage for backend selection and backend list behaviors. |
| `tests/processors/test_processors.py` | Updates processor test to call `fit()` before use. |
| `tests/pipeline/test_pipeline_basic.py` | Adds debug-mode test coverage and adjusts basic pipeline flow. |
| `tests/pipeline/test_pipeline_testing.py` | Removes explicit dict-backend fixture usage. |
| `tests/io_mixin/test_io_mixin.py` | Removes explicit dict-backend fixture usage. |
| `tests/functional/test_emails_generic.py` | Removes explicit dict-backend fixture usage. |
| `tests/detectors/test_thanks_detector.py` | Removes explicit dict-backend fixture usage. |
| `tests/detectors/test_basic_detector.py` | Adjusts dict-based detector test to call `fit()`. |
| `tests/detectors/test_emergency_detector.py` | Adds a new detector test focusing on debug output. |
| `docs/tutorials/05b_MelusineDetectorsAdvanced.md` | Updates debug-mode documentation to emphasize `debug_mode` argument. |
| `docs/docs_src/BasicClassification/tutorial001.py` | Updates example to use `debug_mode=True` instead of `df.debug`. |
| `docs/docs_src/GettingStarted/tutorial002.py` | Updates example to use `debug_mode=True` instead of `df.debug`. |
| `docs/docs_src/MelusineDetectors/tutorial003.py` | Updates example to use `debug_mode=True` instead of `df.debug`. |
This pull request introduces improvements to the backend management system in Melusine with dynamic backend selection based on data type. It also updates documentation and tests to reflect these changes, and adds minor features and fixes.
Backend system improvements:
- Refactored `ActiveBackend` to support a list of backends (`backend_list`) instead of a single backend, allowing multiple backends to be registered and prioritized. The backend is now automatically selected based on the data type using a new `select_backend` method. Added methods to add and reset backends, and updated all backend operations to use the selected backend. (`melusine/backend/active_backend.py`)
- Added a `supported_types` property in the `BaseTransformerBackend` abstract class and implemented it in `DictBackend` and `PandasBackend` to declare supported data types for each backend. (`melusine/backend/base_backend.py`, `melusine/backend/dict_backend.py`, `melusine/backend/pandas_backend.py`)
Debug mode documentation:
- Updated the debug-mode documentation and tutorial examples to use the `debug_mode=True` argument instead of `df.debug`. (`docs/tutorials/05b_MelusineDetectorsAdvanced.md`, `docs/docs_src/BasicClassification/tutorial001.py`, `docs/docs_src/GettingStarted/tutorial002.py`, `docs/docs_src/MelusineDetectors/tutorial003.py`)
Other improvements and fixes:
- Added the segmentation keywords “Copie” and “Attachments”. (`melusine/processors.py`)
- Exposed `MelusinePipeline` in the top-level package. (`melusine/__init__.py`)
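The selection mechanism described above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR. The class names (`ToyBackend`, `ToyDictBackend`, `ToyListBackend`, `ToyActiveBackend`) and the method names `add_backend`, `reset_backends` and `apply` are hypothetical stand-ins. Only `backend_list`, `select_backend` and `supported_types` are taken from the description. A list-of-rows backend stands in for the pandas backend so the sketch has no third-party dependency, and inserting new backends at the front of the list is an assumed priority rule.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, List, Tuple


class ToyBackend(ABC):
    """Hypothetical stand-in for the backend interface."""

    @property
    @abstractmethod
    def supported_types(self) -> Tuple[type, ...]:
        """Data types this backend declares it can process."""

    @abstractmethod
    def apply(self, data: Any, func: Callable[[dict], dict]) -> Any:
        """Apply a row-level function to the data."""


class ToyDictBackend(ToyBackend):
    @property
    def supported_types(self) -> Tuple[type, ...]:
        return (dict,)

    def apply(self, data: dict, func: Callable[[dict], dict]) -> dict:
        # A dict is treated as a single row.
        return func(data)


class ToyListBackend(ToyBackend):
    """Processes a list of dict rows (stands in for a DataFrame backend)."""

    @property
    def supported_types(self) -> Tuple[type, ...]:
        return (list,)

    def apply(self, data: list, func: Callable[[dict], dict]) -> list:
        return [func(row) for row in data]


class ToyActiveBackend:
    """Keeps a prioritized backend list and routes calls by data type."""

    def __init__(self) -> None:
        self.backend_list: List[ToyBackend] = []
        self.reset_backends()

    def reset_backends(self) -> None:
        # Assumption: the default state holds only the dict backend.
        self.backend_list = [ToyDictBackend()]

    def add_backend(self, backend: ToyBackend, first: bool = True) -> None:
        # Assumption: earlier entries in the list have higher priority.
        if first:
            self.backend_list.insert(0, backend)
        else:
            self.backend_list.append(backend)

    def select_backend(self, data: Any) -> ToyBackend:
        # Return the first registered backend whose supported types match.
        for backend in self.backend_list:
            if isinstance(data, backend.supported_types):
                return backend
        raise TypeError(f"No registered backend supports {type(data).__name__}")

    def apply(self, data: Any, func: Callable[[dict], dict]) -> Any:
        # Every operation goes through the backend selected for this data.
        return self.select_backend(data).apply(data, func)


def add_flag(row: dict) -> dict:
    """Example row-level transformation."""
    return {**row, "flag": True}


active = ToyActiveBackend()
active.add_backend(ToyListBackend())
single_row = active.apply({"body": "hello"}, add_flag)
many_rows = active.apply([{"body": "a"}, {"body": "b"}], add_flag)
print(type(active.select_backend(single_row)).__name__)
print(type(active.select_backend(many_rows)).__name__)
```

With this design, calling code never names a backend explicitly: the same `apply` call handles a single dict or a collection of rows, and a reset restores the default list, which is the kind of behavior a backend reset fixture in tests would rely on.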