| 1 | +--- |
| 2 | +title: "SSoC ’25 Week 02 Update by Muhammad Haroon" |
| 3 | +excerpt: "Setting up AudioGen locally and building a simple user interface using Streamlit for generating audio from text." |
| 4 | +category: "DEVELOPER NEWS" |
| 5 | +date: "2025-06-15" |
| 6 | +slug: "2025-06-15-ssoc-25-MuhammadHaroon-week02" |
| 7 | +author: "Muhammad Haroon" |
| 8 | +description: "SSoC'25 Contributor working on Generative AI Instrument Sample Generation for Music Blocks" |
| 9 | +tags: "ssoc25,sugarlabs,week02,GenAI,MusicBlocks,Music" |
| 10 | +image: "assets/Images/GSOC.png" |
| 11 | +--- |
| 12 | + |
| 13 | +<!-- markdownlint-disable --> |
| 14 | + |
| 15 | +# Week 02 Progress Report by Muhammad Haroon |
| 16 | + |
| 17 | +**Project:** [Generative AI Instrument Sample Generation for Music Blocks](https://github.com/sugarlabs/GSoC/blob/master/Ideas-2025.md#Generative-AI-Instrument-Sample-Generation-for-Music-Blocks) |
| 18 | +**Mentors:** [Walter Bender](https://github.com/walterbender), [Sumit Srivastava](https://github.com/sum2it) |
| 19 | +**Assisting Mentors:** [Devin Ulibarri](https://github.com/pikurasa) |
| 20 | +**Reporting Period:** 2025-06-09 - 2025-06-15 |
| 21 | + |
| 22 | +--- |
| 23 | + |
| 24 | +## Goals for This Week |
| 25 | + |
| 26 | +- **Goal 1:** Set up AudioGen locally. |
| 27 | +- **Goal 2:** Create a UI using Streamlit. |
| 28 | + |
| 29 | +--- |
| 30 | + |
| 31 | +## This Week's Achievements |
| 32 | + |
| 33 | +1. **Set up AudioGen locally** |
| 34 | + - I set up AudioGen locally by following the [AudioGen docs](https://github.com/facebookresearch/audiocraft/blob/main/docs/AUDIOGEN.md). I also created a virtual environment and a requirements.txt file to make the project easier to run. |
| 35 | + |
| 36 | +2. **Create a UI using Streamlit** |
| 37 | + - I built a simple UI with Streamlit, using the [Streamlit docs](https://docs.streamlit.io/) as a guide. |
| 38 | + |
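As a rough illustration of how these two pieces could fit together, here is a minimal sketch of a Streamlit page that passes a text prompt to AudioGen. It is not the project's actual code. The `facebook/audiogen-medium` checkpoint and the `AudioGen` calls follow the AudioGen docs linked above; the default prompt, the `MAX_SECONDS` cap, and the `estimate_wait_seconds` helper are hypothetical. The helper assumes generation time scales linearly, at the 3x to 4x slowdown implied by the timing reported in the Challenges section of this post.

```python
import sys

# Assumption: 5 minutes of audio took about 15-20 minutes locally,
# i.e. roughly 3x to 4x slower than real time, and this scales linearly.
SLOWDOWN_RANGE = (3.0, 4.0)
MAX_SECONDS = 30  # hypothetical cap for the duration slider


def estimate_wait_seconds(audio_seconds):
    """Return a (low, high) estimate of generation time in seconds."""
    low, high = SLOWDOWN_RANGE
    return audio_seconds * low, audio_seconds * high


def main():
    # Heavy imports stay inside main() so the helper above works without them.
    import streamlit as st
    from audiocraft.models import AudioGen

    @st.cache_resource  # load the model once, not on every script rerun
    def load_model():
        return AudioGen.get_pretrained("facebook/audiogen-medium")

    st.title("AudioGen sample generator")
    prompt = st.text_input("Describe the sound", "a single flute note")
    seconds = st.slider("Duration (seconds)", 1, MAX_SECONDS, 5)
    low, high = estimate_wait_seconds(seconds)
    st.caption(f"Estimated wait: {low:.0f}-{high:.0f} s")

    if st.button("Generate") and prompt.strip():
        model = load_model()
        model.set_generation_params(duration=seconds)
        with st.spinner("Generating..."):
            wav = model.generate([prompt])  # tensor: [batch, channels, samples]
        # Play the first (only) sample in the browser.
        st.audio(wav[0].cpu().numpy(), sample_rate=model.sample_rate)


# `streamlit run app.py` executes this file in a process that has already
# imported streamlit, so the UI starts only in that case.
if "streamlit" in sys.modules:
    main()
```

Saved as `app.py` (a hypothetical filename), this would be started with `streamlit run app.py`. Caching the model with `st.cache_resource` matters because Streamlit reruns the whole script on every interaction, and reloading the checkpoint each time would add to an already long wait.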
| 39 | +--- |
| 40 | + |
| 41 | +## Challenges & How I Overcame Them |
| 42 | + |
| 43 | +- **Challenge:** My main constraint was limited hardware. AudioCraft (which provides AudioGen) requires a GPU with at least 16 GB of memory to run inference with the medium-sized models (~1.5B parameters). Generating 5 minutes of audio took around 15-20 minutes. |
| 44 | +- **Solution:** I ran the model and used the waiting time to complete other tasks. I plan to deploy the model on AWS, where I expect significantly better performance. |
| 45 | + |
| 46 | +--- |
| 47 | + |
| 48 | +## Key Learnings |
| 49 | + |
| 50 | +- Gained familiarity with **Streamlit** |
| 51 | + |
| 52 | +--- |
| 53 | + |
| 54 | +## Next Week's Roadmap |
| 55 | + |
| 56 | +- Generate more samples using AudioGen and save them in Google Drive. |
| 57 | + |
| 58 | +--- |
| 59 | + |
| 60 | +## Acknowledgments |
| 61 | + |
| 62 | +Thank you to my mentors, the Sugar Labs community, and fellow GSoC contributors for ongoing support. |
| 63 | + |
| 64 | +--- |
| 65 | + |
| 66 | +## Connect with Me |
| 67 | + |
| 68 | +- GitHub: [@haroon10725](https://github.com/haroon10725) |
| 69 | + |
| 70 | +- LinkedIn: [Muhammad Haroon](https://www.linkedin.com/in/muhammad-haroon-7003b923b/) |
| 71 | + |
| 72 | +--- |