havenpersona/ossl-v1

Video-Guided Text-to-Music Generation Using Public Domain Movie Collections (ISMIR 2025)

This repository provides the Python implementation of the model architecture proposed in our ISMIR 2025 paper, "Video-Guided Text-to-Music Generation Using Public Domain Movie Collections". The architecture integrates a video adapter into MusicGen.

[Figure: Model Architecture]

If you find this repository useful for your research, please consider citing our paper.

@article{kim2025ossl,
  title = {Video-Guided Text-to-Music Generation Using Public Domain Movie Collections},
  author = {Haven Kim and Zachary Novack and Weihan Xu and Julian McAuley and Hao-Wen Dong},
  journal = {ISMIR 2025},
  year = {2025},
  url = {https://arxiv.org/abs/2506.12573}
}

Acknowledgements

Our implementation builds heavily on the official audiocraft repository.

Open Screen Sound Library Version 1 (OSSL-v1)

Please see this webpage to download the dataset.
