Voice-powered development workflows - Push-to-talk → Whisper transcription → AI enhancement → Instant feature specifications
Transform voice into detailed development specifications in seconds.
Press F12 → Speak your feature brief → Release → Get an enhanced prompt with objectives, risks, acceptance criteria, and more.
lazy-ptt-enhancer is a globally-installable voice-to-prompt toolkit that:
- Captures your voice via push-to-talk (F12 by default)
- Transcribes locally with GPU-accelerated Whisper (offline capable)
- Enhances with AI - OpenAI turns the transcript into structured specifications
- Saves to your workspace - prompts appear directly in project-management/prompts/
- Works everywhere - install once, use in any project directory
No copy-paste. No context switching. Just speak and code.
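Conceptually the flow is capture → transcribe → enhance → save. The sketch below illustrates that shape in plain Python; the function names and the stubbed transcription/enhancement steps are illustrative only, not the package's actual API:

```python
from datetime import datetime
from pathlib import Path


def transcribe(audio: bytes) -> str:
    # Stub: the real tool runs local Whisper here.
    return "Add user authentication with OAuth2 and session management"


def enhance(brief: str) -> str:
    # Stub: the real tool calls the OpenAI API here.
    return f"# FEATURE Plan\n\n**Summary**: {brief}\n"


def save_prompt(markdown: str, workspace: Path) -> Path:
    # Prompts land in <workspace>/project-management/prompts/.
    prompts_dir = workspace / "project-management" / "prompts"
    prompts_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    out = prompts_dir / f"PROMPT-{stamp}.md"
    out.write_text(markdown, encoding="utf-8")
    return out


def run_pipeline(audio: bytes, workspace: Path) -> Path:
    return save_prompt(enhance(transcribe(audio)), workspace)
```

The real tool replaces the two stubs with faster-whisper and the OpenAI API, respectively.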
pip install lazy-ptt-enhancer
cd ~/my-awesome-project
lazy-ptt init

This will:
- ✅ Check dependencies (Python, audio devices, etc.)
- ✅ Create the project-management/prompts/ directory
- ✅ Generate a .env configuration template
- ✅ Download the Whisper model (optional)
# Edit .env file
OPENAI_API_KEY=sk-your-actual-api-key

lazy-ptt daemon --verbose-cycle

- Press F12
- Say: "Add user authentication with OAuth2 and session management"
- Release F12
Result: Enhanced prompt saved to ./project-management/prompts/PROMPT-{timestamp}.md
# FEATURE Plan
**Summary**: Add user authentication with OAuth2 and session management
## Objectives
- Implement OAuth2 authentication flow
- Add JWT-based session management
- Create user profile management
## Acceptance Criteria
- [ ] Users can sign in with Google/GitHub
- [ ] Sessions persist across browser restarts
- [ ] Users can view and edit their profile
---
🎤 Generated with lazy-ptt-enhancer by @therouxe

- ✅ Global installation - Install once with pip, use anywhere
- ✅ Per-project initialization - `lazy-ptt init` in any directory
- ✅ Push-to-talk audio capture - F12 (configurable via CLI)
- ✅ Local Whisper transcription - GPU-accelerated, offline capable
- ✅ AI prompt enhancement - Structured output with objectives, risks, criteria
- ✅ Workspace-aware storage - Saves to current directory's project-management/prompts/
- ✅ Auto-move by default - No staging folder (configurable via --no-auto-move)
- ✅ Always-on daemon mode - Background process for any project
- ✅ Claude Code integration - Designed for plugin compatibility
- ✅ Branded output - Attribution to @therouxe in all generated prompts
- ⚡ GPU acceleration - CUDA support for faster transcription
- 🌍 Multi-language - Transcribe in English, Spanish, French, German, etc.
- 🎛️ Fully configurable - Environment variables, CLI flags, or YAML config
- 🔒 Privacy-first - Whisper runs locally, only enhancement hits the API
- 📊 Metadata tracking - JSON metadata alongside each prompt
- 🌐 REST API - FastAPI server for non-Python clients
- 🎙️ Device selection - Choose your microphone with `lazy-ptt devices`
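The metadata-tracking feature means each prompt can be picked up programmatically. A sketch of grabbing the newest prompt plus its JSON sidecar, assuming (this is an assumption, not documented behavior) the sidecar shares the prompt's basename:

```python
import json
from pathlib import Path
from typing import Tuple


def latest_prompt(prompts_dir: Path) -> Tuple[Path, dict]:
    """Newest PROMPT-*.md plus its JSON metadata sidecar, if present."""
    prompts = sorted(
        prompts_dir.glob("PROMPT-*.md"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    if not prompts:
        raise FileNotFoundError(f"no prompts found in {prompts_dir}")
    newest = prompts[0]
    sidecar = newest.with_suffix(".json")
    metadata = json.loads(sidecar.read_text()) if sidecar.exists() else {}
    return newest, metadata
```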
- Python 3.9+ (3.11+ recommended)
- PortAudio (for audio capture)
  - macOS: brew install portaudio
  - Debian/Ubuntu: sudo apt-get install libportaudio2
  - Windows: included with pip packages
- CUDA Toolkit (optional, for GPU acceleration)
- OpenAI API Key (for prompt enhancement)
# Using pip (recommended)
pip install lazy-ptt-enhancer
# Or using uv (faster)
uv pip install lazy-ptt-enhancer
# Verify installation
lazy-ptt --help

cd ~/my-project
lazy-ptt init
# This creates:
# - project-management/prompts/ directory
# - .lazy-ptt/staging/ directory
# - .env configuration template
# - Downloads Whisper model (optional)

Edit the generated .env file:
# REQUIRED
OPENAI_API_KEY=sk-your-key
# OPTIONAL (defaults shown)
WHISPER_MODEL_SIZE=medium
WHISPER_DEVICE=auto
PTT_HOTKEY=<f12>

Run once per work session:

lazy-ptt daemon --verbose-cycle

Then press F12 anytime to capture voice input in ANY directory.
Output:
🎤 Daemon started. Press <f12> to capture voice anytime.
Auto-move: ✅ ENABLED (saves to project-management)
Working directory: /home/user/my-project
[✅ project-management] Prompt: ./project-management/prompts/PROMPT-20251030.md (FEATURE)
Tip: The daemon works across all projects. Change directories and press F12 - prompts save to the new directory's project-management/.
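Because the daemon simply writes files, other tooling can react to new prompts by polling the output directory. A minimal stdlib sketch of one polling step (a hypothetical helper, not part of the package):

```python
from pathlib import Path
from typing import List, Set, Tuple


def new_prompts(prompts_dir: Path, seen: Set[Path]) -> Tuple[List[Path], Set[Path]]:
    # One polling step: report prompts that appeared since `seen`,
    # and return the updated snapshot for the next call.
    current = set(prompts_dir.glob("PROMPT-*.md"))
    return sorted(current - seen), current
```

Call it in a loop with a short `time.sleep` between steps to tail the directory.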
Capture one voice input and exit:
lazy-ptt listen

Press F12, speak, release F12.
Output:
Push-to-talk active. Hold the configured hotkey, speak, and release to process.
Prompt saved to: ./project-management/prompts/PROMPT-20251030-143022.md
✅ Prompt saved to project-management workspace (auto-move enabled)
Detected work type: FEATURE
Summary: Add payment processing with Stripe integration
Disable auto-move (keep in staging):
lazy-ptt listen --no-auto-move

Have a text brief already? Enhance it directly:

lazy-ptt enhance-text --text "Add payment processing with Stripe"

Or from a file:

lazy-ptt enhance-text --file brief.txt

Already have a recording?

lazy-ptt process-audio recording.wav

Supports: .wav, .mp3, .flac, .ogg
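If you have a whole folder of recordings, you can shell out to the CLI once per file. A sketch, assuming `lazy-ptt` is on your PATH (the batching helper itself is hypothetical, not part of the package):

```python
import subprocess
from pathlib import Path
from typing import List

# Formats the CLI accepts, per the list above.
SUPPORTED = {".wav", ".mp3", ".flac", ".ogg"}


def audio_files(folder: Path) -> List[Path]:
    # Recordings eligible for `lazy-ptt process-audio`.
    return sorted(p for p in folder.iterdir() if p.suffix.lower() in SUPPORTED)


def process_folder(folder: Path) -> None:
    for recording in audio_files(folder):
        subprocess.run(["lazy-ptt", "process-audio", str(recording)], check=True)
```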
All settings configurable via CLI:
# Disable auto-move (keep in staging)
lazy-ptt listen --no-auto-move
# Custom story ID
lazy-ptt listen --story-id US-3.4 --story-title "User Authentication"
# Verbose logging
lazy-ptt daemon --verbose-cycle

# Required
export OPENAI_API_KEY=sk-...
# Optional (defaults shown)
export WHISPER_MODEL_SIZE=medium # tiny, base, small, medium, large
export WHISPER_DEVICE=auto # auto, cpu, cuda
export PTT_HOTKEY="<f12>"
export PROJECT_MANAGEMENT_ROOT=./project-management
export PTT_OUTPUT_ROOT=./project-management/prompts

Create .lazy-ptt.yaml in project root (optional):
openai:
api_key: ${OPENAI_API_KEY} # Reference env vars
model: gpt-4
temperature: 0.7
whisper:
model_size: medium
language: en
device: auto
ptt:
hotkey: "<f12>"
output_root: project-management/prompts
paths:
project_management_root: ./project-management

| Command | Description |
|---|---|
| `lazy-ptt init` | Initialize lazy-ptt in current directory |
| `lazy-ptt listen` | Capture single voice input |
| `lazy-ptt enhance-text` | Enhance text brief (no voice) |
| `lazy-ptt process-audio` | Transcribe + enhance audio file |
| `lazy-ptt daemon` | Run always-on background listener |
| `lazy-ptt devices` | List available microphones |
| `lazy-ptt --help` | Show help message |
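The YAML config shown earlier references `${OPENAI_API_KEY}`; values like that resolve from the environment. If you want the same substitution in your own tooling, here is a minimal stdlib sketch (the package's actual interpolation rules may differ):

```python
import os
import re

# Matches ${VAR} references such as ${OPENAI_API_KEY}.
_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str) -> str:
    # Substitute each ${VAR} with its environment value (empty string if unset).
    return _ENV_REF.sub(lambda m: os.environ.get(m.group(1), ""), value)
```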
--no-auto-move # Keep in staging (auto-move is DEFAULT)
--story-id ID # Override story ID (default: auto-generate)
--story-title "Title" # Add story title metadata
--verbose # Enable verbose logging
--verbose-cycle # Log each daemon capture cycle
--no-download         # Skip Whisper model download (init only)

# Initialize in new project
cd ~/new-project
lazy-ptt init
# List available microphones
lazy-ptt devices
# Start daemon with verbose output
lazy-ptt daemon --verbose-cycle
# Capture voice with metadata
lazy-ptt listen --story-id US-3.4 --story-title "User Authentication"
# Enhance text brief
lazy-ptt enhance-text --text "Fix login timeout bug"
# Process pre-recorded audio
lazy-ptt process-audio demo.wav
# Keep prompt in staging (disable auto-move)
lazy-ptt listen --no-auto-move

Terminal 1 (run once per session):

lazy-ptt daemon --verbose-cycle

Terminal 2 (use Claude Code):
cd ~/my-project
claude-code
# Voice workflow:
# 1. Press F12 anywhere, say "Add OAuth2 authentication"
# 2. Release F12
# 3. Prompt auto-saved to ./project-management/prompts/
# 4. In Claude Code: /lazy create-feature project-management/prompts/PROMPT-{timestamp}.md

Add to your plugin's .claude/commands/voice.md:
# /voice - Capture voice input
## Implementation
```bash
lazy-ptt listen --verbose
# Get the last prompt path
PROMPT_PATH=$(ls -t project-management/prompts/PROMPT-*.md | head -1)
echo "✅ Prompt saved to: $PROMPT_PATH"
echo "Next: /lazy create-feature $PROMPT_PATH"
```
Usage in Claude Code:
```bash
/voice
# → Press F12, speak
# → Prompt auto-saved
# → Follow suggested command to create feature
```
Run daemon as systemd service (Linux):
# Copy service file
sudo cp ops/systemd/lazy-ptt-daemon.service /etc/systemd/system/
# Edit paths and environment
sudo nano /etc/systemd/system/lazy-ptt-daemon.service
# Enable and start
sudo systemctl enable lazy-ptt-daemon
sudo systemctl start lazy-ptt-daemon
sudo systemctl status lazy-ptt-daemon

Or launchd (macOS):
# Copy plist
cp ops/launchd/io.lazy.ptt.daemon.plist ~/Library/LaunchAgents/
# Edit paths
nano ~/Library/LaunchAgents/io.lazy.ptt.daemon.plist
# Load and start
launchctl load ~/Library/LaunchAgents/io.lazy.ptt.daemon.plist
launchctl start io.lazy.ptt.daemon

See CLAUDE_CODE_INTEGRATION.md for the complete integration guide.
Run the API server:
lazy-ptt-api   # Serves on http://127.0.0.1:8000

# Enhance text
curl -X POST http://127.0.0.1:8000/enhance-text \
-H 'Content-Type: application/json' \
-d '{"text":"Add OAuth2 authentication"}' | jq .
# Process audio file
curl -X POST http://127.0.0.1:8000/process-audio \
-F '[email protected]' | jq .
# Trigger PTT capture (requires active desktop session)
curl -X POST http://127.0.0.1:8000/listen-once | jq .

**Problem**: `lazy-ptt` command not found. Solution:
pip install lazy-ptt-enhancer
which lazy-ptt   # Verify installation

**Problem**: no audio captured, or the wrong microphone is used. Solution:
# List available audio devices
lazy-ptt devices
# Select device by index
PTT_INPUT_DEVICE_INDEX=1 lazy-ptt listen

**Problem**: missing OpenAI API key. Solution:
# Set in environment
export OPENAI_API_KEY=sk-...
# Or create .env file
echo "OPENAI_API_KEY=sk-..." > .env

**Problem**: Whisper model unavailable or the download fails. Solution:
# Install faster-whisper
pip install faster-whisper
# Or skip download during init
lazy-ptt init --no-download

**Problem**: transcription is slow or runs out of GPU memory. Solution:
# Use smaller Whisper model
export WHISPER_MODEL_SIZE=small # or base, tiny
# Or force CPU mode
export WHISPER_DEVICE=cpu

**Problem**: prompts are not saved to project-management/prompts/. Solution:
# Check working directory
pwd
# Auto-move is DEFAULT, but verify:
lazy-ptt daemon --verbose-cycle   # Should show "Auto-move: ✅ ENABLED"
# If needed, re-initialize
lazy-ptt init

- README.md (this file) - User guide and quick start
- CLAUDE_CODE_INTEGRATION.md - Plugin integration patterns
- DEV_SPEC.md - Development specification and roadmap
- PROJECT_STATUS.md - Current implementation status
- examples/EXAMPLE_OUTPUT.md - Sample branded output
- docs/TROUBLESHOOTING.md - Detailed troubleshooting
- ✅ Global pip install workflow
- ✅ Per-project initialization (`lazy-ptt init`)
- ✅ Auto-move by default (configurable)
- ✅ Push-to-talk audio capture
- ✅ Local Whisper transcription
- ✅ AI prompt enhancement
- ✅ Always-on daemon mode
- ✅ REST API server
- ✅ Branding footer
- ✅ systemd/launchd service configs
- Local LLM support (Ollama, llama.cpp)
- Custom enhancement profiles (security, marketing, etc.)
- Profile hot-reload
- Multi-language transcription (auto-detect)
- Multi-language enhancement (French, Spanish, German, etc.)
- Qt/Electron desktop app
- Live audio levels + transcription preview
- Session history browser
- Visual configuration editor
Contributions welcome! See CONTRIBUTING.md for guidelines.
git clone https://github.com/MacroMan5/STT-Devellopement-Prompt-Enhancer.git
cd STT-Devellopement-Prompt-Enhancer
python -m venv .venv
source .venv/bin/activate
pip install -e ".[api,ui,stt]"
pytest tests/

- Formatter: Black (line length 100)
- Linter: Ruff
- Type Checker: Mypy (planned)
- Docstrings: Google style
MIT License - See LICENSE for details.
Copyright (c) 2025 @therouxe
- OpenAI Whisper - Fast, accurate speech recognition
- faster-whisper - GPU-accelerated Whisper implementation
- OpenAI API - Powerful prompt enhancement
- Claude Code - AI-assisted development workflows
- GitHub Issues: Report bugs or request features
- Documentation: Complete guides
- Twitter/X: @therouxe
lazy-ptt-enhancer - Voice-powered development workflows Created by @therouxe
⭐ Star on GitHub | 📖 Documentation | 🐛 Report Issues