Contribution: Add new audio/speech metrics for generative audio

## 🚀 Feature

Add new audio metrics for generative audio processing

### Motivation

The evaluation of speech processing (denoising, dereverberation and, more generally, enhancement) depends heavily on audio metrics. Generative AI is now widely used for speech/audio enhancement and is becoming the new SOTA. However, evaluating generative speech enhancement requires **reference-free (target-less)** metrics that correlate highly with MOS (Mean Opinion Score). The currently implemented metrics cannot correctly assess generative speech enhancement algorithms (e.g. those based on diffusion or GANs) because they rely heavily on reference/target audio.

Newer metrics such as DNSMOS, NISQA, CDPAM and WARP-Q allow a well-founded assessment of such algorithms (they are either reference-free or designed for generative methods). In addition, they have been shown to correlate better with MOS than traditional metrics (PESQ, STOI...).
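As a toy illustration of the reference dependence described above, the sketch below computes SI-SDR with plain NumPy. SI-SDR is used here only as a simple stand-in for reference-based metrics; it is my choice for the example, not one of the metrics named in this issue, and the signals and the `si_sdr` helper are hypothetical. A pure tone with a shifted phase sounds the same as the target, yet it scores far worse than a sample-aligned noisy copy, because the metric compares waveforms sample by sample.

```python
import numpy as np


def si_sdr(estimate: np.ndarray, target: np.ndarray) -> float:
    """Scale-invariant signal-to-distortion ratio in dB (needs a reference)."""
    # Remove DC offsets so the projection is not biased by the mean.
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target; the remainder counts as distortion.
    alpha = np.dot(estimate, target) / np.dot(target, target)
    projection = alpha * target
    distortion = estimate - projection
    return float(10 * np.log10(np.sum(projection**2) / np.sum(distortion**2)))


sr = 16000                      # assumed sample rate for the toy signals
t = np.arange(sr) / sr          # one second of audio
target = np.sin(2 * np.pi * 220 * t)

# Case 1: sample-aligned copy of the target with additive white noise.
rng = np.random.default_rng(0)
noisy = target + 0.1 * rng.standard_normal(sr)

# Case 2: the same tone with a different phase (not sample-aligned).
shifted = np.sin(2 * np.pi * 220 * t + 1.5)

score_noisy = si_sdr(noisy, target)
score_shifted = si_sdr(shifted, target)

print(f"noisy but aligned: {score_noisy:.1f} dB")
print(f"clean but shifted: {score_shifted:.1f} dB")
```

Generative models can produce plausible speech that is similarly misaligned with the target, which is why a reference-free predictor of MOS is a better fit for them. This toy case does not reproduce the behaviour of PESQ or STOI specifically.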

### Pitch

It would be great to have these metrics included, as they are currently only available in scattered repositories:

- [WARP-Q](https://github.com/wjassim/WARP-Q)
- [DNSMOS](https://github.com/microsoft/DNS-Challenge/tree/master/DNSMOS)
- [CDPAM](https://github.com/pranaymanocha/PerceptualAudio/tree/master/cdpam)
- [NISQA](https://github.com/gabrielmittag/NISQA)

### Alternatives

I cannot think of any.

