The Official PyTorch Implementation of FN-SSL & IPDnet for Sound Source Localization [INTERSPEECH2023 & TASLP2024]

Audio-WestlakeU/FN-SSL

Full-band and Narrow-band fusion Network for SSL

Introduction

This repository provides sound source localization methods based on a full-band and narrow-band fusion network. The narrow-band module processes along-time sequences to learn narrow-band spatial information. The full-band module processes along-frequency sequences to learn the full-band correlation of spatial cues, such as the linear relation between the DP-IPD (direct-path inter-channel phase difference) and frequency.
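The linear relation between DP-IPD and frequency can be illustrated with a short sketch. It assumes a far-field source, a single microphone pair, and a sign and angle convention chosen only for illustration; the function names, the 8 cm microphone spacing, and the frequency grid are hypothetical and are not taken from this repository.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def far_field_tdoa(mic_distance_m, doa_deg):
    """Time difference of arrival (seconds) for a far-field source.

    doa_deg is measured from the microphone axis (assumed convention).
    """
    return mic_distance_m * math.cos(math.radians(doa_deg)) / SPEED_OF_SOUND


def dp_ipd(freq_hz, tdoa_s):
    """Direct-path inter-channel phase difference in radians, wrapped to [-pi, pi)."""
    phase = 2.0 * math.pi * freq_hz * tdoa_s  # linear in frequency before wrapping
    return (phase + math.pi) % (2.0 * math.pi) - math.pi


tdoa = far_field_tdoa(mic_distance_m=0.08, doa_deg=60.0)
freqs = [250.0 * k for k in range(1, 9)]  # 250 Hz ... 2000 Hz
ipds = [dp_ipd(f, tdoa) for f in freqs]

# One along-frequency sequence of the spatial cue for a fixed source direction.
for f, p in zip(freqs, ipds):
    print(f"{f:7.1f} Hz  ->  {p:+.4f} rad")
```

Because the unwrapped phase equals 2πfτ for a delay τ, the values grow proportionally with frequency until phase wrapping occurs. This is the kind of cross-frequency regularity that a full-band module can exploit.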

Methods

Three officially implemented sound source localization methods are included:

Datasets

Quick start (will be updated soon)

  • Preparation

    • Download the required dataset and organize the data according to the data_org in the data folder.
    • Generate multi-channel data. You can set data_num (in Simu.py) to control the size of the dataset. The --train, --test, and --dev flags select generation of the training, test, and validation datasets, respectively. The corresponding source data paths are specified in Opt.py, e.g. dirs['sousig_train'].
    python Simu.py --train/--test/--dev
    
  • Training

    • We have implemented both FN-SSL and IPDnet using the PyTorch Lightning framework.
    • To train:
    python main.py fit --data.batch_size=[*,*] --trainer.devices=*,*
    
    • To test:
    python main.py test  --ckpt_path logs/MyModel/version_x/checkpoints/**.ckpt --trainer.devices=*,*
    
  • Pretrained models

    • Use the FN_lightning model to load the Lightning checkpoints in plain PyTorch.
Framework | Task | Checkpoint
Lightning | DP-IPD regression (FN-SSL) | https://pan.baidu.com/s/1zRKpiqbSuo80Xu5ZRoS1gQ?pwd=6w51
Lightning | DOA classification (FN-SSL) | https://pan.baidu.com/s/1U1Wl5ZBZBItc2Vku7AyqNA?pwd=ceqm

More checkpoints will be added soon.
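For readers who want to inspect or reuse the weights outside Lightning, the following is a generic sketch and not the repository's FN_lightning loader. It relies on the fact that a Lightning checkpoint is a dictionary whose model weights are stored under the "state_dict" key, often with keys prefixed by the attribute name of the wrapped network. The prefix "arch.", the helper names, and the checkpoint path argument are hypothetical; check the actual key names in the downloaded checkpoint before using it.

```python
def strip_prefix(state_dict, prefix):
    """Return a copy of state_dict with `prefix` removed from keys that start with it."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


def load_lightning_weights(model, ckpt_path, prefix="arch."):
    """Load weights from a Lightning checkpoint into a plain torch.nn.Module.

    `prefix` is a hypothetical attribute-name prefix; adjust it to the real keys.
    """
    import torch  # imported lazily so strip_prefix works without PyTorch installed

    # Recent PyTorch versions may need weights_only=False here, because
    # Lightning checkpoints can contain non-tensor objects.
    checkpoint = torch.load(ckpt_path, map_location="cpu")
    weights = strip_prefix(checkpoint["state_dict"], prefix)
    model.load_state_dict(weights)  # strict by default: unmatched keys raise an error
    return model


# Demonstration of the key renaming on a toy dictionary (no real weights).
toy = {"arch.block1.weight": 1, "arch.block1.bias": 2, "other": 3}
print(strip_prefix(toy, "arch."))
```

Keys that do not start with the prefix are left unchanged, so a strict load_state_dict call will report them; that is usually a helpful signal that the assumed prefix does not match the checkpoint.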

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{wang2023fnssl,
    author = "Yabo Wang and Bing Yang and Xiaofei Li",
    title = "FN-SSL: Full-Band and Narrow-Band Fusion for Sound Source Localization",
    booktitle = "Proceedings of INTERSPEECH",
    year = "2023",
    pages = ""}

Reference code

We gratefully acknowledge Cross3D and icoCNN, which provided the foundation for the dataset generation and simulation procedures used here.

Licence

MIT
