
OpenSW

Open-source Speech-to-Text Desktop Application

OpenSW Logo

The Japanese version of this README is available here.


Overview

OpenSW is a cross-platform desktop application for quick and efficient speech-to-text conversion. It leverages OpenAI Whisper for local transcription and optionally integrates with Ollama to refine transcribed text using LLMs.

Key Features

  • 🎤 Local Speech Recognition – Uses Whisper for on-device transcription (no cloud required)
  • ⚡ GPU Acceleration – CUDA on Windows, Metal on macOS for fast inference
  • 🤖 LLM Text Refinement – Optional Ollama integration to clean up filler words and improve punctuation
  • ⌨️ Global Shortcut – Press Ctrl+Alt+Space from anywhere to start/stop recording
  • 📋 Auto Clipboard – Transcribed text is automatically copied to clipboard
  • 🔔 System Notifications – Get notified when transcription is complete
  • 📍 System Tray – Runs in background with easy access from tray icon
  • 🖥️ Compact Recording Mode – Minimal floating window during recording

Screenshots

Main Window


Recording Workflow

Recording → Transcribing → Refining → Copied

Installation

Download Pre-built Binaries

Pre-built binaries are available for Windows and macOS (Apple Silicon):

👉 Download from Releases

| Platform | File |
| --- | --- |
| Windows (exe) | OpenSW.exe |
| Windows (msi) | OpenSW_0.1.0_x64_en-US.msi |
| macOS (Apple Silicon) | OpenSW_0.1.0_aarch64.dmg |

Build from Source

Prerequisites

Platform-Specific Requirements

Windows:

  • Visual Studio Build Tools 2019+
  • CUDA Toolkit (recommended for GPU acceleration)

macOS:

  • Xcode Command Line Tools
  • Metal is used automatically for GPU acceleration

Linux:

  • Standard development tools (build-essential, etc.)
  • CUDA Toolkit (for GPU acceleration)

Build Commands

# Clone the repository
git clone https://github.com/liebe-magi/OpenSW.git
cd OpenSW

# Install dependencies
bun install

# Run in development mode
bun run tauri dev

# Build for production (sets platform-specific environment variables automatically)
bun run tauri:build

Download Whisper Model

Download a Whisper GGML model from:

👉 https://huggingface.co/ggerganov/whisper.cpp/tree/main

| Model | Size | Accuracy | Speed |
| --- | --- | --- | --- |
| ggml-tiny.bin | ~75 MB | Low | Fastest |
| ggml-base.bin | ~142 MB | Medium | Fast |
| ggml-small.bin | ~466 MB | Good | Moderate |
| ggml-medium.bin | ~1.5 GB | High | Slow |
| ggml-large-v3-turbo.bin | ~1.6 GB | High | Moderate |
| ggml-large-v3.bin | ~3 GB | Highest | Slowest |

Tip: For Japanese transcription, ggml-medium.bin or larger is recommended for best accuracy.

Usage

Quick Start

  1. Select a Whisper model – On first launch, click "Select" to choose your downloaded Whisper GGML model file (.bin).

  2. Configure audio input – Select your preferred microphone from the dropdown.

  3. Start recording – Press Ctrl+Alt+Space or click the tray icon.

  4. Stop recording – Press Ctrl+Alt+Space again. The audio will be transcribed and copied to your clipboard.
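The flow above moves through the same phases shown in the recording workflow screenshots. A minimal sketch of that state progression (illustrative only, not the actual implementation; the `Phase` names and `nextPhase` helper are assumptions):

```typescript
// Phases of one recording cycle, matching the workflow screenshots.
type Phase = "Idle" | "Recording" | "Transcribing" | "Refining" | "Copied";

// Advance to the next phase. "Refining" only occurs when Ollama
// integration is enabled; otherwise transcription goes straight to
// the clipboard ("Copied").
function nextPhase(current: Phase, ollamaEnabled: boolean): Phase {
  switch (current) {
    case "Idle":
      return "Recording";
    case "Recording":
      return "Transcribing";
    case "Transcribing":
      return ollamaEnabled ? "Refining" : "Copied";
    case "Refining":
      return "Copied";
    case "Copied":
      return "Idle";
  }
}
```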

Optional: Ollama Integration

To enable LLM-based text refinement:

  1. Install and run Ollama
  2. Pull a model (e.g., ollama pull llama3.2)
  3. In OpenSW, configure the Ollama settings:
    • URL: http://localhost:11434 (default)
    • Model: Select your installed model
    • Prompt: Customize the refinement prompt
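Under the hood, refinement amounts to one non-streaming request to Ollama's standard /api/generate endpoint. A sketch of the request body (the `{text}` placeholder convention and `buildRefineRequest` helper are illustrative assumptions, not OpenSW's actual prompt format):

```typescript
// Request body for Ollama's /api/generate endpoint.
interface GenerateRequest {
  model: string;   // e.g. "llama3.2"
  prompt: string;  // refinement prompt with the transcript substituted in
  stream: boolean; // false: return the full response as one JSON object
}

// Substitute the transcript into the prompt template. Here "{text}"
// marks the insertion point (an illustrative convention).
function buildRefineRequest(
  model: string,
  template: string,
  transcript: string,
): GenerateRequest {
  return {
    model,
    prompt: template.replace("{text}", transcript),
    stream: false,
  };
}
```

The resulting body would be POSTed to `http://localhost:11434/api/generate`, and the refined text comes back in the `response` field of the JSON reply.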

Troubleshooting

macOS: "App is damaged and can't be opened"

If you see a message saying "OpenSW.app is damaged and can't be opened" when trying to run the app, this is due to macOS Gatekeeper security settings (because the app is not notarized by Apple).

Solution:

Run the following command in Terminal to remove the quarantine attribute:

xattr -cr /Applications/OpenSW.app

(Adjust the path if you installed the app somewhere else)

Configuration

All settings are stored locally and persist across sessions:

| Setting | Description |
| --- | --- |
| Audio Device | Select input microphone |
| Whisper Model | Path to GGML model file |
| Language | Transcription language (Japanese/English) |
| Ollama URL | Ollama server address |
| Ollama Model | LLM model for text refinement |
| Prompt Template | Custom prompt for refinement |
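As a rough picture of what gets persisted, the settings above can be modeled as a single record with sensible defaults. The field names and default values below are assumptions for illustration, not the actual on-disk schema:

```typescript
// Illustrative shape of the persisted settings (field names assumed).
interface Settings {
  audioDevice: string | null;      // input microphone; null = not chosen yet
  whisperModelPath: string | null; // path to the GGML .bin model file
  language: "ja" | "en";           // transcription language
  ollamaUrl: string;               // Ollama server address
  ollamaModel: string | null;      // LLM model for refinement
  promptTemplate: string;          // custom refinement prompt
}

// Defaults on first launch: no device or model selected, Ollama at its
// standard local address. The "{text}" placeholder is illustrative.
function defaultSettings(): Settings {
  return {
    audioDevice: null,
    whisperModelPath: null,
    language: "ja",
    ollamaUrl: "http://localhost:11434",
    ollamaModel: null,
    promptTemplate: "Clean up filler words and fix punctuation: {text}",
  };
}
```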

Tech Stack

  • Frontend: React 18, TypeScript, Vite
  • Backend: Rust, Tauri 2.0
  • Speech Recognition: whisper-rs (whisper.cpp bindings)
  • Audio: cpal, hound, rodio
  • LLM Integration: Ollama API via reqwest

Signing & Auto-Update

OpenSW includes built-in auto-update functionality. The distributed binaries are signed for secure updates.

For Contributors / Self-Builders

If you build from source, signing is optional:

  • Without signing: Development builds work normally (bun run tauri dev)
  • With signing: Required only for distributing signed releases with auto-update

Setting Up Signing (Maintainers Only)

# Generate signing keys
bunx tauri signer generate -w ~/.tauri/opensw.key

# Copy .env.example to .env.local and configure
cp .env.example .env.local
# Edit .env.local with your key path and password

# Build with signing
bun run tauri:build

# Generate latest.json for release
bun run release:prepare

Note: The public key in tauri.conf.json is used to verify updates. If you fork this project and want auto-updates, you'll need to generate your own key pair and update the public key.
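For orientation, the `latest.json` that `release:prepare` generates follows the Tauri updater manifest format: a version, release notes, a publication date, and per-platform download URLs with their signatures. A sketch under that assumption (the `buildManifest` helper, notes text, and URL pattern are placeholders, not the script's actual output):

```typescript
// Sketch of a Tauri updater manifest (latest.json).
interface UpdateManifest {
  version: string;
  notes: string;
  pub_date: string; // RFC 3339 timestamp
  // One entry per target, e.g. "darwin-aarch64", "windows-x86_64".
  platforms: Record<string, { signature: string; url: string }>;
}

function buildManifest(version: string): UpdateManifest {
  return {
    version,
    notes: "See the release page for details.",
    pub_date: new Date().toISOString(),
    platforms: {
      "darwin-aarch64": {
        signature: "<contents of the generated .sig file>",
        url: `https://github.com/liebe-magi/OpenSW/releases/download/v${version}/OpenSW_${version}_aarch64.dmg`,
      },
    },
  };
}
```

The updater fetches this file, compares `version` against the installed build, and verifies each platform's `signature` against the public key in `tauri.conf.json` before installing.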

Development

Project Structure

OpenSW/
├── src/                    # React frontend
│   ├── components/         # UI components
│   └── App.tsx
├── src-tauri/              # Rust backend
│   ├── src/
│   │   ├── main.rs         # Application entry point
│   │   ├── audio.rs        # Audio recording/playback
│   │   ├── ollama.rs       # Ollama API client
│   │   ├── clipboard.rs    # Clipboard operations
│   │   └── tray.rs         # System tray setup
│   └── Cargo.toml
└── package.json

Commands

# Development
bun run dev          # Start Vite dev server
bun run tauri dev    # Run Tauri in development mode

# Build
bun run build           # Build frontend
bun run tauri:build     # Build distributable (with signing if configured)
bun run release:prepare # Generate latest.json for updater

# Code Quality
bun run lint         # Run ESLint
bun run format       # Format with Prettier

Contributing

Contributions are welcome! Please feel free to submit issues and pull requests.

License

This project is dual-licensed under either:

  • Apache License, Version 2.0 (LICENSE-APACHE)
  • MIT License (LICENSE-MIT)

at your option.

Acknowledgments

  • OpenAI Whisper – Speech recognition model
  • whisper.cpp – Lightweight Whisper implementation
  • Tauri – Cross-platform desktop framework
  • Ollama – Local LLM runtime
