AUDIO SPECTROGRAM TRANSFORMER
PyTorch implementation of a ViT-based classifier that achieved 97% accuracy on the FSDD (Free Spoken Digit Dataset) audio dataset.
| Model | Accuracy |
|---|---|
| ViT Audio Classifier | 97% |
| ResNet Audio Classifier | 93% |
| ResNet with PolyLoss | 93% |
See the other branches for the ResNet and ResNet + PolyLoss code used in the comparison.
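For readers unfamiliar with PolyLoss, here is a minimal sketch of its simplest variant, Poly-1, which adds a term `epsilon * (1 - p_target)` to the usual cross-entropy. This is not the code from the PolyLoss branch: the function name `poly1_cross_entropy`, the plain-Python single-example form, and the default `epsilon=1.0` are assumptions made for illustration only.

```python
import math


def poly1_cross_entropy(logits, target, epsilon=1.0):
    """Poly-1 loss for a single example (illustrative sketch, not the repo's code).

    logits:  list of raw class scores
    target:  index of the true class
    epsilon: weight of the extra (1 - p_target) term; 1.0 is an assumed default
    """
    # Numerically stable softmax: subtract the max logit before exponentiating.
    max_logit = max(logits)
    exps = [math.exp(z - max_logit) for z in logits]
    p_target = exps[target] / sum(exps)

    # Standard cross-entropy for the true class.
    cross_entropy = -math.log(p_target)

    # Poly-1 adds a linear penalty on the probability mass missing from the true class.
    return cross_entropy + epsilon * (1.0 - p_target)


# Example: 10 classes (FSDD covers the spoken digits 0-9), true class is 3.
example_logits = [0.1, 0.2, 0.0, 2.5, 0.3, 0.1, 0.0, 0.2, 0.1, 0.4]
loss = poly1_cross_entropy(example_logits, target=3)
```

With `epsilon=0` the function reduces to plain cross-entropy, which makes it easy to compare the two losses on the same inputs.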




