OpenSW is a cross-platform desktop application for quick and efficient speech-to-text conversion. It leverages OpenAI Whisper for local transcription and optionally integrates with Ollama to refine transcribed text using LLMs.
- 🎤 Local Speech Recognition – Uses Whisper for on-device transcription (no cloud required)
- ⚡ GPU Acceleration – CUDA on Windows, Metal on macOS for fast inference
- 🤖 LLM Text Refinement – Optional Ollama integration to clean up filler words and improve punctuation
- ⌨️ Global Shortcut – Press `Ctrl+Alt+Space` from anywhere to start/stop recording
- 📋 Auto Clipboard – Transcribed text is automatically copied to clipboard
- 🔔 System Notifications – Get notified when transcription is complete
- 📌 System Tray – Runs in background with easy access from tray icon
- 🖥️ Compact Recording Mode – Minimal floating window during recording
| Recording | Transcribing | Refining | Copied |
|---|---|---|---|
| ![]() | ![]() | ![]() | ![]() |
Pre-built binaries are available for Windows and macOS (Apple Silicon):
| Platform | File |
|---|---|
| Windows (exe) | OpenSW.exe |
| Windows (msi) | OpenSW_0.1.0_x64_en-US.msi |
| macOS (Apple Silicon) | OpenSW_0.1.0_aarch64.dmg |
- Bun (or npm/yarn)
- Rust (1.70+)
- Whisper GGML Model (any size: tiny, base, small, medium, large)
Windows:
- Visual Studio Build Tools 2019+
- CUDA Toolkit (recommended for GPU acceleration)
macOS:
- Xcode Command Line Tools
- Metal is used automatically for GPU acceleration
Linux:
- Standard development tools (`build-essential`, etc.)
- CUDA Toolkit (for GPU acceleration)
```bash
# Clone the repository
git clone https://github.com/liebe-magi/OpenSW.git
cd OpenSW

# Install dependencies
bun install

# Run in development mode
bun run tauri dev

# Build for production (sets platform-specific environment variables automatically)
bun run tauri:build
```

Download a Whisper GGML model from:
π https://huggingface.co/ggerganov/whisper.cpp/tree/main
| Model | Size | Accuracy | Speed |
|---|---|---|---|
| `ggml-tiny.bin` | ~75 MB | Low | Fastest |
| `ggml-base.bin` | ~142 MB | Medium | Fast |
| `ggml-small.bin` | ~466 MB | Good | Moderate |
| `ggml-medium.bin` | ~1.5 GB | High | Slow |
| `ggml-large-v3-turbo.bin` | ~1.6 GB | High | Moderate |
| `ggml-large-v3.bin` | ~3 GB | Highest | Slowest |
Tip: For Japanese transcription, `ggml-medium.bin` or larger is recommended for best accuracy.
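All of the files in the table above live in the same Hugging Face repository, so the download URL can be derived from the file name. A small TypeScript sketch (`modelUrl` is a hypothetical helper; `resolve/main` is the path Hugging Face uses to serve raw file contents):

```typescript
// Hypothetical helper: build the raw-file download URL for any model
// listed in the table above.
function modelUrl(file: string): string {
  return `https://huggingface.co/ggerganov/whisper.cpp/resolve/main/${file}`;
}

console.log(modelUrl("ggml-base.bin"));
// → https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-base.bin
```

The resulting URL can be fetched directly, e.g. with `curl -L -o ggml-base.bin <url>`.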
1. Select a Whisper model – On first launch, click "Select" to choose your downloaded Whisper GGML model file (`.bin`).
2. Configure audio input – Select your preferred microphone from the dropdown.
3. Start recording – Press `Ctrl+Alt+Space` or click the tray icon.
4. Stop recording – Press `Ctrl+Alt+Space` again. The audio will be transcribed and copied to your clipboard.
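Since the same shortcut both starts and stops recording, the handler is effectively a two-state toggle. A minimal TypeScript sketch (the state and action names are illustrative, not OpenSW's actual internals):

```typescript
// Illustrative two-state toggle for the single global shortcut.
type State = "idle" | "recording";
type Action = "startRecording" | "stopAndTranscribe";

function onShortcut(state: State): { next: State; action: Action } {
  return state === "idle"
    ? { next: "recording", action: "startRecording" }
    : { next: "idle", action: "stopAndTranscribe" };
}

// Two presses = one record/transcribe/copy cycle:
const first = onShortcut("idle");      // start recording
const second = onShortcut(first.next); // stop, transcribe, copy to clipboard
```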
To enable LLM-based text refinement:

1. Install and run Ollama
2. Pull a model (e.g., `ollama pull llama3.2`)
3. In OpenSW, configure the Ollama settings:
   - URL: `http://localhost:11434` (default)
   - Model: Select your installed model
   - Prompt: Customize the refinement prompt
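Refinement requests of this kind go to Ollama's `/api/generate` endpoint. A sketch of the request shape, assuming a `{text}` placeholder in the prompt template (the helper and the placeholder convention are illustrative, not OpenSW's actual code; the field names follow the public Ollama API):

```typescript
// Illustrative request builder for Ollama's /api/generate endpoint.
interface OllamaSettings {
  url: string;            // e.g. http://localhost:11434
  model: string;          // e.g. llama3.2
  promptTemplate: string; // refinement prompt with a {text} placeholder
}

function buildRefineRequest(s: OllamaSettings, transcript: string) {
  return {
    endpoint: `${s.url}/api/generate`,
    body: {
      model: s.model,
      prompt: s.promptTemplate.replace("{text}", transcript),
      stream: false, // return the full refined text in one response
    },
  };
}
```

POSTing `body` as JSON to `endpoint` returns the refined text in the response's `response` field when `stream` is `false`.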
If you see a message saying "OpenSW.app is damaged and can't be opened" when trying to run the app, this is due to macOS Gatekeeper security settings (because the app is not notarized by Apple).
Solution:
Run the following command in Terminal to remove the quarantine attribute:

```bash
xattr -cr /Applications/OpenSW.app
```

(Adjust the path if you installed the app somewhere else.)
All settings are stored locally and persist across sessions:
| Setting | Description |
|---|---|
| Audio Device | Select input microphone |
| Whisper Model | Path to GGML model file |
| Language | Transcription language (Japanese/English) |
| Ollama URL | Ollama server address |
| Ollama Model | LLM model for text refinement |
| Prompt Template | Custom prompt for refinement |
- Frontend: React 18, TypeScript, Vite
- Backend: Rust, Tauri 2.0
- Speech Recognition: whisper-rs (whisper.cpp bindings)
- Audio: cpal, hound, rodio
- LLM Integration: Ollama API via reqwest
OpenSW includes built-in auto-update functionality. The distributed binaries are signed for secure updates.
If you build from source, signing is optional:
- Without signing: Development builds work normally (`bun run tauri dev`)
- With signing: Required only for distributing signed releases with auto-update
```bash
# Generate signing keys
bunx tauri signer generate -w ~/.tauri/opensw.key

# Copy .env.example to .env.local and configure
cp .env.example .env.local
# Edit .env.local with your key path and password

# Build with signing
bun run tauri:build

# Generate latest.json for release
bun run release:prepare
```

Note: The public key in `tauri.conf.json` is used to verify updates. If you fork this project and want auto-updates, you'll need to generate your own key pair and update the public key.
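For orientation, the updater manifest generated for a release looks roughly like this. The field names follow the Tauri v2 updater's static JSON format, but treat the exact shape as an assumption and check the `latest.json` your build actually produces; the version, date, and URL below are placeholders:

```typescript
// Placeholder example of a latest.json updater manifest (Tauri v2 format).
const manifest = {
  version: "0.1.0",
  notes: "Bug fixes and improvements",
  pub_date: "2024-01-01T00:00:00Z", // RFC 3339 timestamp
  platforms: {
    "darwin-aarch64": {
      signature: "<contents of the generated .sig file>",
      url: "https://example.com/releases/OpenSW_0.1.0_aarch64.app.tar.gz",
    },
  },
};
```

The updater fetches this file, compares `version` to the installed build, and verifies the downloaded archive against `signature` using the public key in `tauri.conf.json`.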
```
OpenSW/
├── src/                 # React frontend
│   ├── components/      # UI components
│   └── App.tsx
├── src-tauri/           # Rust backend
│   ├── src/
│   │   ├── main.rs      # Application entry point
│   │   ├── audio.rs     # Audio recording/playback
│   │   ├── ollama.rs    # Ollama API client
│   │   ├── clipboard.rs # Clipboard operations
│   │   └── tray.rs      # System tray setup
│   └── Cargo.toml
└── package.json
```
```bash
# Development
bun run dev              # Start Vite dev server
bun run tauri dev        # Run Tauri in development mode

# Build
bun run build            # Build frontend
bun run tauri:build      # Build distributable (with signing if configured)
bun run release:prepare  # Generate latest.json for updater

# Code Quality
bun run lint             # Run ESLint
bun run format           # Format with Prettier
```

Contributions are welcome! Please feel free to submit issues and pull requests.
This project is dual-licensed under either:
at your option.
- OpenAI Whisper – Speech recognition model
- whisper.cpp – Lightweight Whisper implementation
- Tauri – Cross-platform desktop framework
- Ollama – Local LLM runtime





