Add a caching mechanism for benchmark and tuning #196

@juanmc2005

Problem

Tuning and evaluating diarization pipelines with different models, or combinations of models, is becoming increasingly difficult, even with a GPU.

Idea

Implement a caching mechanism that saves segmentation and embedding outputs to disk. For example, we could use ~/.diart/cache by default and let users change the location with a --cache option.
This could be implemented as an additional parameter of SpeakerDiarization:

pipeline = SpeakerDiarization(config, cache="default")

Here, cache would have type str | Path | None. Using cache=None would disable caching, cache="default" would use ~/.diart/cache, and cache=Path("/some/dir") or cache="/some/dir" would dump/load the cache to/from that directory.
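To make the three cases concrete, here is a minimal sketch of how the proposed argument could be mapped to a directory. The names resolve_cache_dir and DEFAULT_CACHE_DIR are hypothetical and do not exist in diart; only the ~/.diart/cache default and the accepted types come from the proposal above.

```python
from pathlib import Path
from typing import Optional, Union

# Hypothetical constant: the default location suggested in this issue
DEFAULT_CACHE_DIR = Path.home() / ".diart" / "cache"


def resolve_cache_dir(cache: Union[str, Path, None]) -> Optional[Path]:
    """Map the proposed `cache` argument to a directory (None means caching is off)."""
    if cache is None:
        # cache=None disables caching entirely
        return None
    if isinstance(cache, str) and cache == "default":
        # cache="default" selects ~/.diart/cache
        return DEFAULT_CACHE_DIR
    # Any other str or Path is treated as a user-provided directory
    return Path(cache).expanduser()
```

One consequence of this design is that a directory literally named "default" would have to be passed as Path("default") rather than as a string.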

The caching logic could even be implemented as a wrapper of SegmentationModel and EmbeddingModel.
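The wrapper idea could look roughly like the sketch below. CachedModel is a hypothetical name, and the sketch wraps any callable rather than diart's actual SegmentationModel or EmbeddingModel, whose inputs are audio tensors that would need a stable serialization before hashing. It uses only the standard library and assumes inputs pickle deterministically.

```python
import hashlib
import pickle
from pathlib import Path
from typing import Any, Callable


class CachedModel:
    """Hypothetical wrapper that stores the outputs of `model` on disk, keyed by a hash of the input."""

    def __init__(self, model: Callable[[Any], Any], cache_dir: Path, name: str):
        self.model = model
        # One subdirectory per model so different models never share entries
        self.cache_dir = Path(cache_dir) / name
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def _path_for(self, inputs: Any) -> Path:
        # Assumption: equal inputs always produce identical pickle bytes
        digest = hashlib.sha256(pickle.dumps(inputs)).hexdigest()
        return self.cache_dir / f"{digest}.pkl"

    def __call__(self, inputs: Any) -> Any:
        path = self._path_for(inputs)
        if path.exists():
            # Cache hit: load the stored output instead of running the model
            with path.open("rb") as f:
                return pickle.load(f)
        # Cache miss: run the model and persist its output
        outputs = self.model(inputs)
        with path.open("wb") as f:
            pickle.dump(outputs, f)
        return outputs
```

In a real implementation, name would need to identify the model weights and any configuration that changes the output, otherwise stale entries could be returned after switching models.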

Labels: feature (New feature or request)
