Full-band and Narrow-band fusion Network for SSL

Introduction

This repository provides sound source localization (SSL) methods based on a full-band and narrow-band fusion network. The narrow-band module processes along-time sequences to learn narrow-band spatial information. The full-band module processes along-frequency sequences to learn the full-band correlation of spatial cues, such as the linear relation between the direct-path inter-channel phase difference (DP-IPD) and frequency.
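
The two ideas above can be illustrated with a small, dependency-free sketch. It is not the repository's code: the function names, the toy grid, and the time difference of arrival (TDOA) value are assumptions made for illustration, and the sign of the phase depends on the chosen convention. It shows that the unwrapped DP-IPD grows linearly with frequency for a fixed TDOA, and how one feature grid can be read as along-time sequences (narrow-band view) or along-frequency sequences (full-band view).

```python
import math

def dp_ipd(freq_hz, tdoa_s):
    """Unwrapped direct-path IPD in radians: linear in frequency for a fixed TDOA."""
    return 2.0 * math.pi * freq_hz * tdoa_s

def wrap(phase):
    """Wrap a phase to [-pi, pi), as an observed phase difference would be."""
    return (phase + math.pi) % (2.0 * math.pi) - math.pi

def narrow_band_view(spec):
    """spec[f][t] -> one along-time sequence per frequency bin."""
    return [list(frames) for frames in spec]

def full_band_view(spec):
    """spec[f][t] -> one along-frequency sequence per time frame."""
    return [list(bins) for bins in zip(*spec)]

# Hypothetical values: a 0.1 ms TDOA and five frequency bins.
tdoa = 1e-4
freqs = [0.0, 1000.0, 2000.0, 4000.0, 8000.0]
ipd = [dp_ipd(f, tdoa) for f in freqs]
wrapped = [wrap(p) for p in ipd]

# Toy grid of 5 frequency bins x 3 time frames holding the wrapped DP-IPD.
spec = [[w] * 3 for w in wrapped]
print(len(narrow_band_view(spec)), "sequences along time")
print(len(full_band_view(spec)), "sequences along frequency")
```

In a real model the same split would typically be done by reshaping a tensor so that either the frequency axis or the time axis is folded into the batch dimension.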

Methods

Three officially implemented sound source localization methods are included; the repository contains the FN-SSL, IPDnet, and IPDnet2 folders.

Datasets

Source signals: from LibriSpeech database
Noise source signals: from Noise92
Real-world multi-channel microphone signals: from LOCATA database
RealMAN dataset: from RealMAN

Quick start (will be updated soon)

Preparation
- Download the required datasets and organize the data according to data_org in the data folder.
- Generate the multi-channel data. Set data_num (in Simu.py) to control the size of the dataset. The --train, --test, and --dev flags generate the training, test, and validation datasets, respectively. The source data paths are specified by entries such as dirs['sousig_train'] in Opt.py.
```
python Simu.py --train/--test/--dev
```
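
The flag handling described above could look roughly like the following. This is a hedged sketch, not the actual parser in Simu.py: it assumes the three flags are plain boolean switches, and `build_parser` and `selected_splits` are hypothetical names.

```python
import argparse

def build_parser():
    """Assumed command line: three boolean switches, one per dataset split."""
    parser = argparse.ArgumentParser(description="Sketch of a Simu.py-style command line")
    parser.add_argument("--train", action="store_true", help="generate the training set")
    parser.add_argument("--test", action="store_true", help="generate the test set")
    parser.add_argument("--dev", action="store_true", help="generate the validation set")
    return parser

def selected_splits(argv):
    """Return the names of the splits requested on the command line."""
    args = build_parser().parse_args(argv)
    return [name for name in ("train", "test", "dev") if getattr(args, name)]

# Equivalent to running the script with only the --train flag.
print(selected_splits(["--train"]))
```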

Training

We have implemented both FN-SSL and IPDnet using the PyTorch Lightning framework.
To train:

python main.py fit --data.batch_size=[*,*] --trainer.devices=*,*

To test:

python main.py test  --ckpt_path logs/MyModel/version_x/checkpoints/**.ckpt --trainer.devices=*,*
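
When launching several runs from a script, the two commands above can be assembled programmatically. The helper below is only a sketch: `fit_command` and `eval_command` are hypothetical names, and the batch sizes and device string are placeholder values standing in for the `*` entries, not recommended settings.

```python
import shlex

def fit_command(batch_size, devices):
    """Build the argv list for `python main.py fit ...`.

    batch_size: sequence of ints, rendered as [a,b]
    devices: string passed through unchanged to --trainer.devices
    """
    sizes = ",".join(str(b) for b in batch_size)
    return ["python", "main.py", "fit",
            f"--data.batch_size=[{sizes}]",
            f"--trainer.devices={devices}"]

def eval_command(ckpt_path, devices):
    """Build the argv list for `python main.py test ...`."""
    return ["python", "main.py", "test",
            "--ckpt_path", ckpt_path,
            f"--trainer.devices={devices}"]

# Hypothetical values, chosen only to show the shape of the command.
cmd = fit_command(batch_size=[2, 2], devices="0,1")
print(shlex.join(cmd))  # shell-quoted, so the brackets are not expanded by the shell
# To launch it: subprocess.run(cmd, check=True)
```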

Pretrained models
- Use the FN_lightning model to load the Lightning checkpoint in the plain PyTorch framework.

| Framework | Task | Checkpoint |
| --- | --- | --- |
| Lightning | DP-IPD regression (FN-SSL) | https://pan.baidu.com/s/1zRKpiqbSuo80Xu5ZRoS1gQ?pwd=6w51 |
| Lightning | DOA classification (FN-SSL) | https://pan.baidu.com/s/1U1Wl5ZBZBItc2Vku7AyqNA?pwd=ceqm |

More checkpoints will be added soon.
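
A Lightning checkpoint is a dictionary that stores the model weights under the "state_dict" key. If the parameter names do not match those of a plain PyTorch module, a small helper like the one below may be useful. This is a generic sketch, not the repository's loading code: `extract_state_dict` is a hypothetical name, and the "arch." prefix is an assumed example of how a wrapper attribute can prefix parameter names.

```python
from collections import OrderedDict

def extract_state_dict(checkpoint, prefix=""):
    """Return the weights from a Lightning-style checkpoint dict.

    Falls back to the dict itself when there is no "state_dict" key, and
    strips `prefix` from the start of each parameter name when given.
    """
    state_dict = checkpoint.get("state_dict", checkpoint)
    cleaned = OrderedDict()
    for name, value in state_dict.items():
        if prefix and name.startswith(prefix):
            name = name[len(prefix):]
        cleaned[name] = value
    return cleaned

# Toy checkpoint with an assumed "arch." prefix; real values would be tensors.
toy = {"state_dict": {"arch.layer.weight": 1, "arch.layer.bias": 2}, "epoch": 3}
print(list(extract_state_dict(toy, prefix="arch.")))

# With PyTorch installed, usage would look like:
#   ckpt = torch.load(path_to_ckpt, map_location="cpu")
#   model.load_state_dict(extract_state_dict(ckpt, prefix="arch."))
```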

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{wang2023fnssl,
    author = "Yabo Wang and Bing Yang and Xiaofei Li",
    title = "FN-SSL: Full-Band and Narrow-Band Fusion for Sound Source Localization",
    booktitle = "Proceedings of INTERSPEECH",
    year = "2023",
    pages = ""}

Reference code

We gratefully acknowledge Cross3D and icoCNN, which provided the foundation for the dataset generation and simulation procedures used here.

Licence

MIT
