Scene-wide Acoustic Parameter Estimation

From WASPAA 2025 Paper: Scene-wide Acoustic Parameter Estimation

Authors: Ricardo Falcon-Perez¹, Ruohan Gao², Gregor Mueckl, Sebastia V. Amengual Gari³, Ishwarya Ananthabhotla³
¹ Aalto University, Finland ² University of Maryland, College Park ³ Meta - Reality Labs Research, USA

Citation

If you use the model, the code, or the MRAS dataset, please cite this work as:

@inproceedings{falconperez2025,
    author  = "{Falcón Pérez}, Ricardo and Gao, Ruohan and Mueckl, Gregor and {Amengual Gari}, Sebastia V. and Ananthabhotla, Ishwarya",
    year    = {2025},
    title   = {Scene-wide Acoustic Parameter Estimation},
    booktitle   = {IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
}

Overview

This is the official repository for the WASPAA 2025 paper Scene-wide Acoustic Parameter Estimation. The goal of this project is to predict, in a single inference step, heatmaps of acoustic parameters that cover the full scene for a given source. We also include the Multi-room Apartments Simulation (MRAS) dataset, a large-scale synthetic dataset that models indoor scenes by connecting multiple rooms. For more details on the dataset, please refer to the paper.

This repo contains:

Pre-trained model
Code (training, inference, dataset preprocessing)
MRAS Dataset
- Raw [coming soon]
  - Scene geometries
  - 2nd-order ambisonic RIRs
  - Acoustic parameters and other metadata
- Preprocessed (packed as LMDB)
  - Scene floormaps
  - Acoustic heatmaps
  - RIRs

Data

This repo includes the preprocessed MRAS/Replica data (via Git LFS) as ZIP files and expects the following layout:

Data/
  preprocessed/         # LMDBs as zip files (tracked with Git LFS)
    mras_relfloor_10x10_moreparams.lmdb.zip
    replica_relcenter_10x10_moreparams.lmdb.zip
    rirs_mono_mras_grids.lmdb.zip
    rirs_mono_scenes_18.lmdb.zip
  raw/                  # Raw data (not yet included)

Getting the data

Install and pull LFS content:

git lfs install
git lfs pull

Unzip LMDB archives in place:

for f in Data/preprocessed/*.zip; do unzip "$f" -d Data/preprocessed/; done

Note on raw data: Data/raw/ is reserved for the raw MRAS assets (scene geometries, ambisonic RIRs, etc.) and is not included yet.
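
After unzipping, a quick check that the four expected LMDBs are present can catch an incomplete LFS pull early. The sketch below assumes each archive unpacks to a path named like the archive without the .zip suffix; check_lmdbs is a hypothetical helper and not part of the repository.

```shell
# Hypothetical helper: report which expected LMDB paths exist under a directory.
# Assumption: each archive unpacks to "<archive name without .zip>".
check_lmdbs() {
    dir="${1:-Data/preprocessed}"
    missing=0
    for name in \
        mras_relfloor_10x10_moreparams.lmdb \
        replica_relcenter_10x10_moreparams.lmdb \
        rirs_mono_mras_grids.lmdb \
        rirs_mono_scenes_18.lmdb
    do
        if [ -e "$dir/$name" ]; then
            echo "ok       $dir/$name"
        else
            echo "MISSING  $dir/$name"
            missing=$((missing + 1))
        fi
    done
    # The exit status is the number of missing entries (0 means all present).
    return "$missing"
}

check_lmdbs Data/preprocessed || echo "some LMDBs are missing"
```

If an entry is reported as missing, re-running git lfs pull and the unzip step is the first thing to try.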

Prerequisites

Creating the environment

Use conda (or mamba) to install the packages in environment.yml. For example:

conda config --add channels conda-forge  
conda config --show channels
conda env create -f environment.yml -n sceacoustics

Alternatively, the environment can be created manually. This can make it easier to avoid conflicts in some systems:

conda config --add channels conda-forge  
conda config --show channels
conda create -n sceacoustics python=3.10 numpy matplotlib seaborn
conda activate sceacoustics
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
conda install tensorboard torchinfo scikit-learn mlflow scipy lmdb python-lmdb tqdm pyyaml easydict configargparse jupyterlab pot wandb
conda install librosa -c conda-forge   # We need librosa to enable soundfile backend for torchaudio
pip install spaudiopy hexagdly open3d torchmetrics ptwt  # NOTE: open3d is tricky to install in headless mode in some linux environments

Then we can initialize the submodules and download them.

git submodule init
git submodule update --recursive
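
To confirm that the submodules were actually fetched, the status output can be inspected: git prefixes the line of any uninitialized submodule with a minus sign. A minimal sketch, where count_uninitialized is a hypothetical helper and not part of the repository:

```shell
# Count status lines that start with "-", which git uses to mark
# submodules that have not been initialized yet.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the
# helper usable under "set -e".
count_uninitialized() {
    grep -c '^-' || true
}

# Only query git when run from inside a working tree.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    n=$(git submodule status --recursive | count_uninitialized)
    echo "uninitialized submodules: $n"
fi
```

A non-zero count suggests repeating the init and update steps before installing the submodules.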

Finally, add the submodules to the environment.

cd ranger...  # navigate to local ranger directory in the repo
pip install -e .


Get the submodules

git submodule update --init decayfitnet

Training: Reproducing the Main Experiments

The paths/filenames below assume you’ve unzipped LMDBs into Data/preprocessed/.

Training with MRAS:

  python train_basic.py -c configs/train_more_parameters.yaml \
    --exp_name $exp_n --use_triton --job_id $job_id --task_id $param \
    --num_workers $num_w --seed 1111 \
    --dataset 'mras' --fold 'fixed_1' \
    --n_files_per_scene 10000000 --max_length 24000 --use_augmentation_getitem \
    --fmap_use_soft_sources \
    --read_lmdb --fname_lmdb 'rirs_mono_mras_grids.lmdb' \
    --read_lmdb_maps --fname_lmdb_maps 'mras_relfloor_10x10_moreparams.lmdb' \
    --rir_output_channels 0

Training with Replica:

  python train_basic.py -c configs/mras_more_parameters.yaml \
    --exp_name $exp_n --use_triton --job_id $job_id --task_id $param \
    --num_workers $num_w --seed 1111 \
    --dataset 'replica' --fold 'balanced_1' \
    --n_files_per_scene 10000000 --max_length 24000 --use_augmentation_getitem \
    --read_lmdb --fname_lmdb 'rirs_mono_scenes_18.lmdb' \
    --read_lmdb_maps --fname_lmdb_maps 'replica_relcenter_10x10_moreparams.lmdb' \
    --rir_output_channels 0

Key flags & expected files

--read_lmdb --fname_lmdb:
    rirs_mono_mras_grids.lmdb (MRAS RIRs)
    rirs_mono_scenes_18.lmdb (Replica RIRs)

--read_lmdb_maps --fname_lmdb_maps:
    mras_relfloor_10x10_moreparams.lmdb (MRAS maps)
    replica_relcenter_10x10_moreparams.lmdb (Replica maps)

Output channels: --rir_output_channels 0 (mono)
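
The training and inference commands in this README reference four shell variables (exp_n, job_id, param, num_w) that are not defined here. A minimal sketch of setting them before launching, where every value is a placeholder assumption rather than a repository default:

```shell
# Placeholder values only: none of these are defaults taken from the repository.
exp_n="my_experiment"   # passed to --exp_name
job_id=0                # passed to --job_id
param=0                 # passed to --task_id
num_w=4                 # passed to --num_workers

# Abort with a message if any variable is unset or empty, so that a flag
# is never followed by a blank value.
: "${exp_n:?set exp_n}" "${job_id:?set job_id}" "${param:?set param}" "${num_w:?set num_w}"

echo "exp_name=$exp_n job_id=$job_id task_id=$param num_workers=$num_w"
```

With these set in the current shell, the commands can be pasted as written.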

Inference only:

python train_basic.py -c configs/train_more_parameters.yaml \
    --exp_name $exp_n --use_triton --job_id $job_id --task_id $param \
    --num_workers $num_w --seed 1111 \
    --validation_checkpoint 7033621_3_table01_1111_10x10_moreparams_replica_triton_replica_balanced_4 --do_validation \
    --dataset 'replica' --fold 'balanced_4' \
    --n_files_per_scene 10000000 --max_length 24000 --use_augmentation_getitem \
    --read_lmdb --fname_lmdb 'rirs_mono_scenes_18.lmdb' \
    --read_lmdb_maps --fname_lmdb_maps 'replica_relcenter_10x10_moreparams.lmdb' \
    --rir_output_channels 0

python train_basic.py -c configs/mras_more_parameters.yaml \
    --exp_name $exp_n --use_triton --job_id $job_id --task_id $param \
    --num_workers $num_w --seed 1111 \
    --validation_checkpoint 7046468_0_mras_1111_10x10_moreparams_June15_triton_mras_fixed_1 --do_validation \
    --dataset 'mras' --fold 'fixed_1' \
    --n_files_per_scene 10000000 --max_length 24000 --use_augmentation_getitem \
    --fmap_use_soft_sources \
    --read_lmdb --fname_lmdb 'rirs_mono_mras_grids.lmdb' \
    --read_lmdb_maps --fname_lmdb_maps 'mras_relfloor_10x10_moreparams.lmdb' \
    --rir_output_channels 0

📄 License

SceneAcousticEstimation is licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
You are free to share and adapt the dataset, even for commercial use, as long as proper attribution is given.

See the LICENSE file for full terms.
