Don’t Just Read, Live the Story

Built in 8 hours at a hackathon, StoryScape transforms any PDF book into a living adventure with AI-generated visuals, adaptive soundscapes, and consistent characters.

Instead of reading static text, you can walk alongside characters, see their world unfold, and hear the emotions of every scene.

Features

PDF Input → Upload any book in PDF format.
Metadata Extraction (Gemini) → Detects characters, sections, and key story events.
Visual Consistency (Nano Banana) →
- Generates base images for each character once.
- Reuses them for consistency across all visuals.
Adaptive Soundscapes (ElevenLabs) → Short audio tracks matched to the mood of each scene.
Fast and Simple → Single Python script, no complex infra, minimal API calls.
Lightweight Storage → JSON + local images/audio files.
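
The "generate once, reuse" idea behind Visual Consistency can be sketched as a small on-disk cache. This is only an illustration, not the project's actual code: `get_base_portrait`, the `generate` callable, and the file-naming scheme are hypothetical, and the real Nano Banana call is left as a pluggable function.

```python
from pathlib import Path
from typing import Callable


def get_base_portrait(
    name: str,
    generate: Callable[[str], bytes],
    cache_dir: Path = Path("outputs/base_characters"),
) -> Path:
    """Return the portrait path for a character, generating it only on first use."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Hypothetical naming scheme: lowercase name, spaces replaced by underscores.
    path = cache_dir / f"{name.lower().replace(' ', '_')}.png"
    if not path.exists():
        # `generate` stands in for the real image-generation API call.
        path.write_bytes(generate(name))
    return path
```

Every later scene that features the same character can then pass the cached file as a reference image instead of asking for a fresh portrait.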

Tech Stack

google-genai → Gemini for text analysis and placement extraction
Nano Banana (Gemini 2.5 Flash Image) → Character portraits and consistent visuals
ElevenLabs → Dynamic audio/music generation
PyPDF2 → PDF parsing
requests → API calls
python-dotenv → API key management
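
As a rough sketch of the PDF parsing step, page-wise text can be pulled with PyPDF2's `PdfReader`. The helper names `extract_pages` and `read_pdf` are assumptions made for this example and may not match what `script.py` does.

```python
from typing import Any, List


def extract_pages(reader: Any) -> List[str]:
    """Collect page-wise text from a PyPDF2-style reader (any object with .pages)."""
    pages = []
    for page in reader.pages:
        # extract_text() may yield nothing for scanned or image-only pages.
        pages.append((page.extract_text() or "").strip())
    return pages


def read_pdf(path: str) -> List[str]:
    """Open a PDF with PyPDF2 and return one text string per page."""
    from PyPDF2 import PdfReader  # imported lazily so extract_pages has no hard dependency

    return extract_pages(PdfReader(path))
```

Calling `read_pdf("your_book.pdf")` would give a list whose index 0 holds the text of page 1.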

Project Structure

StoryScape/
┣ script.py              # Main system script (all logic in one file)
┣ outputs/               # Generated images and audio
┃ ┣ base_characters/     # Reference portraits
┃ ┣ visuals/             # Scene visuals
┃ ┗ music/               # Scene-specific audio
┣ characters.json        # Extracted characters + references
┣ sections.json          # Extracted sections + summaries
┗ requirements.txt       # Dependencies
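
A script that writes into this layout needs the output folders to exist first. One possible way to create them, assuming a helper named `ensure_output_dirs` (hypothetical; the folder names come from the tree above):

```python
from pathlib import Path
from typing import Dict

# Subfolder names taken from the project structure.
OUTPUT_SUBDIRS = ("base_characters", "visuals", "music")


def ensure_output_dirs(root: Path = Path("outputs")) -> Dict[str, Path]:
    """Create the output root and its subfolders if missing; return name -> path."""
    paths = {}
    for sub in OUTPUT_SUBDIRS:
        folder = root / sub
        folder.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        paths[sub] = folder
    return paths
```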

System Flow

Extract PDF text → Page-wise text stored in memory
Metadata Extraction (Gemini) → Characters and sections in JSON
Base Image Generation (Nano Banana) → Consistent character portraits
Placement Detection (Gemini) → Where to add visuals and music
Scene Visuals (Nano Banana) → Images using base portraits
Music Generation (ElevenLabs) → Scene-specific audio tracks
Compile Outputs → Per page: text, visuals, and music in structured JSON
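
The final "Compile Outputs" step could look roughly like the sketch below. The record fields (`page`, `text`, `visuals`, `music`), the function name `compile_outputs`, and the default output path are assumptions for illustration; the real schema in `script.py` may differ.

```python
import json
from pathlib import Path
from typing import Dict, List, Optional


def compile_outputs(
    pages: List[str],
    visuals: Dict[int, List[str]],
    music: Dict[int, str],
    out_path: Path = Path("outputs/compiled.json"),
) -> List[dict]:
    """Merge each page's text with the visuals and music placed on that page."""
    compiled = []
    for number, text in enumerate(pages, start=1):
        audio: Optional[str] = music.get(number)
        compiled.append({
            "page": number,
            "text": text,
            "visuals": visuals.get(number, []),  # image paths placed on this page
            "music": audio,                      # audio path, or None if no track
        })
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(compiled, indent=2), encoding="utf-8")
    return compiled
```

Keeping one record per page means a reader UI only has to load a single JSON file and index into it by page number.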

Quick Start

Clone the repo and install dependencies: pip install -r requirements.txt
Add your API keys to .env:

GEMINI_API_KEY=your_key_here
NANO_BANANA_API_KEY=your_key_here
ELEVENLABS_API_KEY=your_key_here
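
A minimal sketch of how these keys might be loaded and validated at startup with python-dotenv. The helper names `read_api_keys` and `load_api_keys` are hypothetical; only the three variable names come from this README.

```python
import os
from typing import Dict, Mapping

REQUIRED_KEYS = ("GEMINI_API_KEY", "NANO_BANANA_API_KEY", "ELEVENLABS_API_KEY")


def read_api_keys(env: Mapping[str, str]) -> Dict[str, str]:
    """Return the three keys, failing early if any is missing or blank."""
    missing = [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
    if missing:
        raise RuntimeError("Missing API keys: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}


def load_api_keys() -> Dict[str, str]:
    """Load .env into the process environment, then validate the keys."""
    from dotenv import load_dotenv  # provided by the python-dotenv package

    load_dotenv()  # finds a .env file and adds its values to os.environ
    return read_api_keys(os.environ)
```

Failing before any API call is made avoids spending time on PDF parsing only to hit an authentication error later in the run.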

Run the script with your PDF: python script.py --pdf your_book.pdf
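
The `--pdf` flag could be handled with `argparse` along these lines. This is a sketch under the assumption that `--pdf` is the only required argument; the function name `parse_args` and the help text are illustrative.

```python
import argparse
from pathlib import Path
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse command-line arguments; argv=None falls back to sys.argv[1:]."""
    parser = argparse.ArgumentParser(
        description="Turn a PDF book into an illustrated, scored story."
    )
    parser.add_argument("--pdf", required=True, type=Path, help="Path to the input PDF")
    return parser.parse_args(argv)
```

For example, `parse_args(["--pdf", "your_book.pdf"]).pdf` is a `Path` pointing at `your_book.pdf`, and omitting the flag makes argparse exit with a usage message.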

Check outputs/ for generated images, audio, and the final compiled.json.

Requirements:

google-genai==1.33.0
PyPDF2==3.0.1
requests==2.32.5
python-dotenv==1.1.1

Hackathon Philosophy

All logic in one file (script.py)
Minimal steps, end-to-end in a single run
API calls only where needed
Focused on being demo-ready in under 48 hours

