🤖 AI-powered coding assistant for your terminal
Kimi · GLM · MiniMax · DeepSeek · Ollama · 100% open source
Install • Quick Start • Why Gokin? • Features • Providers • Config • Contribute
Most AI coding tools are closed-source, route your code through third-party servers, and give you zero control over what gets sent to the model. Gokin was built with a different goal: a fast, secure, zero-telemetry CLI where your code goes directly to the provider you choose, and nothing else leaves your machine.
Gokin focuses on a small, well-tested set of providers: Kimi, GLM, MiniMax, DeepSeek (via Anthropic-compatible APIs) and Ollama (fully local). Secrets, credentials, and sensitive code are automatically redacted before reaching any model, TLS is enforced on every connection, and no proxy or middleware ever touches your data.
| Feature | Gokin | Claude Code | Cursor |
|---|---|---|---|
| Price | Free (pay-per-use API) | $20+/month | $20+/month |
| Providers | 5 (Kimi, GLM, MiniMax, DeepSeek, Ollama) | 1 (Claude) | Multi |
| Offline | ✅ Ollama | ❌ | ❌ |
| 54 Tools | ✅ | ~30 | ~30 |
| Multi-agent | ✅ 5 parallel | Basic | ❌ |
| Direct API | ✅ Zero proxies | ✅ | ❌ Routes through Cursor servers |
| Security | ✅ TLS 1.2+, secret redaction (24 patterns), sandbox, 3-level permissions | Basic | Basic |
| Open Source | ✅ | ❌ | ❌ |
| Self-hosting | ✅ | ❌ | ❌ |
Choose your stack:
| Stack | Cost | Best For |
|---|---|---|
| Gokin + Kimi Coding Plan ⭐ | Subscription | Default – Kimi K2.6, 262K context, thinking + tool use, coding-tuned |
| Gokin + DeepSeek V4 ⭐ | Pay-per-use | Recommended as of v0.71 – 1M context, top SWE-bench reasoning, ~20× cheaper than Opus, prompt caching enabled |
| Gokin + GLM Coding Plan ⭐ | ~$3/month | Budget-friendly daily coding, GLM-5/5.1 with thinking |
| Gokin + MiniMax ⭐ | Pay-per-use | 200K context, strong on agentic coding |
| Gokin + Ollama | Free | Privacy, offline, no API costs |
All four cloud providers are actively recommended – they're the daily-driver tier Gokin is tested against every release.
```bash
curl -fsSL https://raw.githubusercontent.com/ginkida/gokin/main/install.sh | sh
```

Or build from source:

```bash
git clone https://github.com/ginkida/gokin.git
cd gokin
go build -o gokin ./cmd/gokin
./gokin --setup
```

Requirements:

- Go 1.25+ (build from source)
- One AI provider (see Providers below)
```bash
# Launch with interactive setup (picks provider + key)
gokin --setup

# Or set an API key and run – Kimi Coding Plan is the v0.69+ default
export GOKIN_KIMI_KEY="sk-kimi-..."
gokin

# Prefer another provider? Any of these also works out of the box:
# export GOKIN_DEEPSEEK_KEY="..."   # DeepSeek V4 (recommended – 1M ctx + cheap)
# export GOKIN_GLM_KEY="..."        # GLM Coding Plan
# export GOKIN_MINIMAX_KEY="..."    # MiniMax
# (Ollama needs no key – just run a local model)
```

Then just talk naturally:
> Explain how auth works in this project
> Add user registration endpoint with validation
> Run the tests and fix any failures
> Refactor this module to use dependency injection
> Create a PR for these changes
- Multi-file Analysis – Understand entire modules via grep + glob + read
- Session Memory – Auto-summarizes your session (files, tools, errors, decisions). Survives context compaction. Optional LLM-based summarization every 3rd extraction.
- Context-aware agents – Read-only tools run in parallel, write tools serialized
```
Priority: Low ──────────────────────────▶ High
          Global → User → Project → Local

Global:  ~/.config/gokin/GOKIN.md
User:    ~/.gokin/GOKIN.md
Project: ./GOKIN.md, .gokin/rules/*.md
Local:   ./GOKIN.local.md (git-ignored)
```
- All layers merged automatically (see the sketch below)
- `@include` directive: `@./path`, `@~/path`, `@/absolute/path`
- File watching with auto-reload on changes
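For intuition, here is a minimal sketch of priority-ordered merging, assuming only the layer paths listed above; function names are hypothetical, not gokin's actual loader:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// mergeRuleLayers illustrates the Global → User → Project → Local merge:
// higher-priority layers are appended after earlier ones, so project- and
// local-level rules refine or override global defaults when read in order.
func mergeRuleLayers(home, project string) string {
	layers := []string{
		filepath.Join(home, ".config", "gokin", "GOKIN.md"), // Global
		filepath.Join(home, ".gokin", "GOKIN.md"),           // User
		filepath.Join(project, "GOKIN.md"),                  // Project
		filepath.Join(project, "GOKIN.local.md"),            // Local (git-ignored)
	}
	var merged strings.Builder
	for _, path := range layers {
		data, err := os.ReadFile(path)
		if err != nil {
			continue // missing layers are simply skipped
		}
		merged.Write(data)
		merged.WriteString("\n")
	}
	return merged.String()
}

func main() {
	home, _ := os.UserHomeDir()
	fmt.Println(mergeRuleLayers(home, "."))
}
```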
- Files: read, write, edit, diff, batch, copy, move, delete
- Search: glob, grep, tree
- Git: status, commit, diff, branch, log, blame, PR
- Run: bash, run_tests, ssh, env
- Plan: todo, task, enter_plan_mode, coordinate
- Memory: memorize, shared_memory, pin_context
- MCP servers: add your own via `/mcp add` (Model Context Protocol, stdio + http transports, per-server permissions)
- Parallel execution: read-only tools (read, grep, glob) run in parallel when the model calls several at once (see the sketch below)
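A rough sketch of that read/write scheduling rule, using hypothetical types rather than gokin's real tool interfaces:

```go
package main

import (
	"fmt"
	"sync"
)

// ToolCall is a hypothetical stand-in for a model-requested tool invocation.
type ToolCall struct {
	Name     string
	ReadOnly bool
	Run      func() string
}

// runBatch executes read-only calls concurrently and write calls one at a
// time, mirroring the "read-only in parallel, writes serialized" rule.
func runBatch(calls []ToolCall) []string {
	results := make([]string, len(calls))
	var wg sync.WaitGroup
	for i, c := range calls {
		if c.ReadOnly {
			wg.Add(1)
			go func(i int, c ToolCall) {
				defer wg.Done()
				results[i] = c.Run()
			}(i, c)
		}
	}
	wg.Wait() // all reads finish before any write mutates shared state
	for i, c := range calls {
		if !c.ReadOnly {
			results[i] = c.Run()
		}
	}
	return results
}

func main() {
	out := runBatch([]ToolCall{
		{Name: "grep", ReadOnly: true, Run: func() string { return "3 matches" }},
		{Name: "read", ReadOnly: true, Run: func() string { return "file contents" }},
		{Name: "write", ReadOnly: false, Run: func() string { return "file updated" }},
	})
	fmt.Println(out)
}
```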
```
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Explore   │────▶│   General   │────▶│    Bash     │
│   (read)    │     │   (write)   │     │  (execute)  │
└─────────────┘     └─────────────┘     └─────────────┘
       │                   │                   │
       └───────────────────┴───────────────────┘
                           │
                    [Progress UI]
```
- Up to 5 parallel agents
- Shared memory between agents
- Automatic task decomposition
- API retry with exponential backoff – agents survive transient API errors (rate limits, timeouts, 500s); see the sketch below
- Provider failover – agents automatically try fallback providers when the primary fails
- Real-time streaming – agent output streamed to the UI as it's generated
- Git worktree support – parallel branch work with isolated sessions
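A minimal sketch of the backoff behavior described above; the 1s starting delay mirrors the `retry_delay: 1s` default in the config reference below, everything else is illustrative:

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
	"time"
)

var errRateLimited = errors.New("429: rate limited")

// withRetry retries transient API errors (rate limits, timeouts, 5xx)
// with an exponentially growing, jittered delay between attempts.
func withRetry(maxRetries int, call func() error) error {
	delay := time.Second
	var err error
	for attempt := 0; ; attempt++ {
		if err = call(); err == nil {
			return nil
		}
		if attempt == maxRetries {
			return fmt.Errorf("giving up after %d retries: %w", maxRetries, err)
		}
		jitter := time.Duration(rand.Int63n(int64(delay) / 2))
		time.Sleep(delay + jitter)
		delay *= 2 // exponential growth between attempts
	}
}

func main() {
	calls := 0
	err := withRetry(3, func() error {
		calls++
		if calls < 3 {
			return errRateLimited // simulate two transient failures
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```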
- Per-model pricing for Kimi, DeepSeek, GLM, MiniMax models (Ollama is free)
- Real-time cost in status bar (`$0.0243`)
- Per-response cost in message footer
- `/cost` and `/stats` commands with accurate model-specific pricing
- Explicit `cache_control` breakpoints for Kimi, MiniMax, and DeepSeek (GLM currently doesn't honour `cache_control` server-side, so we skip markers there); see the sketch after this list
- System prompt, tools, and conversation prefix cached – up to 90% input cost savings
- Cache break detection with efficiency tracking
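To make the breakpoint idea concrete, here is a sketch of an Anthropic-style `cache_control` marker on a system block; struct names are illustrative, not gokin's internal types:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CacheControl marks a prefix breakpoint: the provider caches everything
// up to and including the block that carries it, so repeat requests pay
// only for the uncached suffix.
type CacheControl struct {
	Type string `json:"type"` // "ephemeral" is the standard marker value
}

type SystemBlock struct {
	Type         string        `json:"type"` // "text"
	Text         string        `json:"text"`
	CacheControl *CacheControl `json:"cache_control,omitempty"`
}

func main() {
	system := []SystemBlock{{
		Type: "text",
		Text: "You are a coding assistant with 54 tools...",
		// Breakpoint after the stable prefix (system prompt + tool specs):
		CacheControl: &CacheControl{Type: "ephemeral"},
	}}
	body, _ := json.MarshalIndent(map[string]any{"system": system}, "", "  ")
	fmt.Println(string(body))
}
```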
- Full multi-turn support for Kimi K2.6 / GLM / Anthropic-style reasoning
- Thinking blocks (with `signature`) preserved across turns, including tool calls
- Signature-aware history reconstruction – no "reasoning_content missing" errors (see the sketch below)
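An illustrative shape of the preserved blocks, assuming an Anthropic-style wire format; the signature value here is fake:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ContentBlock is an illustrative, Anthropic-style content block. A
// thinking block must be replayed on later turns exactly as received,
// signature included; dropping or reordering it is what produces
// "reasoning_content missing"-style provider errors.
type ContentBlock struct {
	Type      string `json:"type"`                // "thinking", "text", "tool_use", ...
	Thinking  string `json:"thinking,omitempty"`  // the model's reasoning
	Signature string `json:"signature,omitempty"` // provider-issued integrity token
	Text      string `json:"text,omitempty"`
}

func main() {
	// An assistant turn as it would be stored and later replayed verbatim.
	turn := []ContentBlock{
		{Type: "thinking", Thinking: "The bug is in the retry loop...", Signature: "sig-xxxx"},
		{Type: "text", Text: "Fixed the off-by-one in the retry loop."},
	}
	out, _ := json.Marshal(turn)
	fmt.Println(string(out))
}
```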
- 3-level permissions: Low (auto), Medium (ask once), High (always ask) – see the sketch after this list
- Sandbox mode for bash commands
- Diff preview before applying changes (single-file and multi-file)
- Undo/Redo for all file operations
- Audit logging
- Proactive context compaction – predicts token growth and compacts before hitting model limits
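A minimal sketch of the 3-level permission gate mentioned at the top of this list; names and the session-memory mechanism are hypothetical simplifications:

```go
package main

import "fmt"

// PermissionLevel sketches the 3-level model: Low tools auto-run,
// Medium ask once per session, High ask every time.
type PermissionLevel int

const (
	Low    PermissionLevel = iota // auto-approve (e.g. read, grep)
	Medium                        // ask once, remember for the session
	High                          // always ask (e.g. bash, delete)
)

type Gate struct {
	approved map[string]bool // Medium-level decisions remembered per session
}

func (g *Gate) Allow(tool string, level PermissionLevel, ask func(string) bool) bool {
	switch level {
	case Low:
		return true
	case Medium:
		if g.approved[tool] {
			return true
		}
		if ask(tool) {
			g.approved[tool] = true
			return true
		}
		return false
	default: // High
		return ask(tool)
	}
}

func main() {
	g := &Gate{approved: map[string]bool{}}
	alwaysYes := func(tool string) bool { fmt.Println("asking for:", tool); return true }
	fmt.Println(g.Allow("read", Low, alwaysYes))    // true, no prompt
	fmt.Println(g.Allow("edit", Medium, alwaysYes)) // prompts once
	fmt.Println(g.Allow("edit", Medium, alwaysYes)) // remembered, no prompt
	fmt.Println(g.Allow("bash", High, alwaysYes))   // prompts every time
}
```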
```
┌──────────┐            ┌──────────────────────┐
│  Gokin   │ ──TLS───▶  │     Provider API     │
│ (local)  │            │  (Kimi / Z.AI / ...) │
│          │ ◀───TLS──  │                      │
└──────────┘            └──────────────────────┘
```
No middle servers. No Vercel. No telemetry proxies.
Your API key, your code, your conversation – direct.
Some CLI tools route requests through their own proxy servers (Vercel Edge, custom gateways) for telemetry, analytics, or API key management. Gokin does none of this. Every API call goes directly from your machine to the provider's endpoint. You can verify this – it's open source.
LLM tool calls can accidentally expose secrets found in your codebase. Gokin automatically redacts them before they reach the model or your terminal:
| Category | Examples |
|---|---|
| API keys | AKIA..., ghp_..., sk_live_..., AIza... |
| Tokens | Bearer tokens, JWT (eyJ...), Slack/Discord tokens |
| Credentials | Database URIs (postgres://user:pass@...), Redis, MongoDB |
| Crypto material | PEM private keys, SSH keys |
24 regex patterns, applied to every tool result and audit log. Handles any data type – strings, maps, typed slices, structs. Custom patterns supported via API.
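For illustration, a couple of patterns in the spirit of those rules; the real 24-pattern set lives in the source and differs in detail:

```go
package main

import (
	"fmt"
	"regexp"
)

// Illustrative redaction rules. Each match is replaced before the text
// reaches the model or the terminal.
var redactors = []*regexp.Regexp{
	regexp.MustCompile(`AKIA[0-9A-Z]{16}`),               // AWS access key IDs
	regexp.MustCompile(`ghp_[A-Za-z0-9]{36}`),            // GitHub personal tokens
	regexp.MustCompile(`postgres://[^:\s]+:[^@\s]+@\S+`), // DB URIs with credentials
}

func redact(s string) string {
	for _, re := range redactors {
		s = re.ReplaceAllString(s, "[REDACTED]")
	}
	return s
}

func main() {
	out := redact("conn = postgres://admin:hunter2@db.internal:5432/app")
	fmt.Println(out) // conn = [REDACTED]
}
```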
| Layer | What it does |
|---|---|
| TLS 1.2+ enforced | No weak ciphers, certificate verification always on |
| Sandbox mode | Bash runs in isolated namespace (Linux), safe env whitelist (~35 vars) – API keys never leak to subprocesses |
| Command validation | 50+ blocked patterns: fork bombs, reverse shells, rm -rf /, credential theft, env injection |
| SSH validation | Host allowlist, loopback blocked, username injection prevention |
| Path validation | Symlink resolution, directory traversal blocked, TOCTOU prevention |
| SSRF protection | Private IPs, loopback, link-local blocked; all DNS results checked |
| Audit trail | Every tool call logged with sanitized args |
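As one concrete example from the table, a minimal sketch of the SSRF check: resolve first, then reject loopback, private, and link-local results, so DNS rebinding can't smuggle a private address past a hostname check. Illustrative code, not gokin's implementation:

```go
package main

import (
	"fmt"
	"net"
)

// hostAllowed resolves a host and rejects any loopback, private, or
// link-local address among the results. Checking every resolved IP
// (not just the hostname) is what closes DNS-rebinding holes.
func hostAllowed(host string) (bool, error) {
	ips, err := net.LookupIP(host)
	if err != nil {
		return false, err
	}
	for _, ip := range ips {
		if ip.IsLoopback() || ip.IsPrivate() || ip.IsLinkLocalUnicast() {
			return false, fmt.Errorf("blocked address %s for host %s", ip, host)
		}
	}
	return true, nil
}

func main() {
	ok, err := hostAllowed("localhost")
	fmt.Println(ok, err) // false: loopback is blocked
}
```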
- API keys loaded from env vars or local config (`~/.config/gokin/config.yaml`)
- Keys are masked in all UI displays (`sk-12****cdef`; see the sketch below)
- Keys are never included in conversation history or tool results
- Ollama mode: zero network calls – fully airgapped
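A sketch of the masking format shown above; the exact prefix/suffix lengths are an assumption inferred from `sk-12****cdef`:

```go
package main

import (
	"fmt"
	"strings"
)

// maskKey keeps a short prefix and suffix and stars out the middle,
// so a key is recognizable in the UI without being recoverable.
func maskKey(key string) string {
	if len(key) <= 9 {
		return strings.Repeat("*", len(key)) // too short to expose any part
	}
	return key[:5] + "****" + key[len(key)-4:]
}

func main() {
	fmt.Println(maskKey("sk-1234567890abcdef")) // sk-12****cdef
}
```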
> Remember we use PostgreSQL with pgx driver
> What were our database conventions?
- Project-specific memories
- Auto-inject relevant context
- Stored locally (your data stays yours)
> [!IMPORTANT]
> Recommended providers – daily-driver tier, tested every release:
- Kimi Coding Plan (Kimi K2.6) – default as of v0.69, 262K context, thinking + tools
- DeepSeek V4 (Pro / Flash) – added in v0.71, strongly recommended, 1M context, top SWE-bench reasoning, ~20× cheaper than Opus, prompt caching (95% savings on repeat prefixes)
- GLM Coding Plan (GLM-5 / GLM-5.1) – budget option (~$3/month), thinking supported
- MiniMax (M2.7 / M2.5) – 200K context, strong on agentic coding
Ollama is fully supported for offline / zero-cost workflows.
| Provider | Models | Endpoint | Notes |
|---|---|---|---|
| Kimi ⭐ | `kimi-for-coding` (K2.6) | `api.kimi.com/coding` | Default. Coding Plan subscription; 262K context, reasoning + vision + video, thinking mode |
| DeepSeek ⭐ | `deepseek-v4-pro`, `deepseek-v4-flash`, `deepseek-chat`, `deepseek-reasoner` | `api.deepseek.com/anthropic` | Recommended. V4 ships with 1M context, extended thinking, and Anthropic prompt caching (live-verified 95% savings). Strong-tier SWE-bench reasoning at ~$0.55/$2.19 per 1M tokens for Pro (Flash is $0.27/$1.10). |
| GLM ⭐ | `glm-5.1`, `glm-5`, `glm-4.7` | `api.z.ai/api/anthropic` | Budget-friendly Coding Plan, thinking mode |
| MiniMax ⭐ | `MiniMax-M2.7`, `M2.7-highspeed`, `M2.5`, `M2.5-highspeed` | `api.minimax.io/anthropic` | 200K context, strong on agentic coding |
| Ollama | Any local model (`llama3.2`, `qwen2.5-coder`, ...) | `localhost:11434` | 100% offline, no network calls |
All cloud providers use Anthropic-compatible APIs and share the same client (`internal/client/anthropic.go`) – fewer moving parts, consistent behavior. Kimi auth uses Bearer tokens; GLM, MiniMax, and DeepSeek accept both Bearer and `x-api-key`. Ollama uses its own native client.
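A sketch of what that per-provider auth can look like; illustrative code, not `internal/client/anthropic.go` (the `anthropic-version` header is the standard one for Anthropic-compatible APIs):

```go
package main

import (
	"fmt"
	"net/http"
)

// setAuth applies the auth scheme described above: Kimi takes a Bearer
// token; GLM, MiniMax, and DeepSeek accept both Bearer and x-api-key,
// so setting both is harmless for them.
func setAuth(req *http.Request, provider, key string) {
	switch provider {
	case "kimi":
		req.Header.Set("Authorization", "Bearer "+key)
	default: // glm, minimax, deepseek
		req.Header.Set("Authorization", "Bearer "+key)
		req.Header.Set("x-api-key", key)
	}
	req.Header.Set("anthropic-version", "2023-06-01")
}

func main() {
	req, _ := http.NewRequest("POST", "https://api.deepseek.com/anthropic/v1/messages", nil)
	setAuth(req, "deepseek", "sk-...")
	fmt.Println(req.Header)
}
```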
Switch anytime:
> /provider kimi
> /model kimi-for-coding
> /provider deepseek
> /model deepseek-v4-pro
> /provider glm
> /model glm-5.1
> /provider minimax
> /model MiniMax-M2.7
> /provider ollama
> /model llama3.2
Gokin's kimi provider points at Kimi Coding Plan (`api.kimi.com/coding`) by default – that's where `kimi-for-coding` lives. If you have a Moonshot Developer API key instead, set `model.custom_base_url: https://api.moonshot.ai/anthropic` in your config; Gokin will route through the Developer endpoint using your key. Legacy model names (`kimi-k2.5`, `kimi-k2-thinking-turbo`, `kimi-k2-turbo-preview`) are silently migrated to `kimi-for-coding` on load, unless `custom_base_url` is set, in which case Gokin respects the name you picked.
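A sketch of that load-time migration rule; the map contents match the legacy names above, everything else is illustrative:

```go
package main

import "fmt"

// Legacy Kimi model names that collapse to kimi-for-coding on load.
var legacyKimiModels = map[string]bool{
	"kimi-k2.5":              true,
	"kimi-k2-thinking-turbo": true,
	"kimi-k2-turbo-preview":  true,
}

// migrateModel rewrites legacy names unless the user pinned a custom
// base URL, in which case their explicit choice is respected.
func migrateModel(name, customBaseURL string) string {
	if customBaseURL == "" && legacyKimiModels[name] {
		return "kimi-for-coding"
	}
	return name
}

func main() {
	fmt.Println(migrateModel("kimi-k2.5", ""))                                  // kimi-for-coding
	fmt.Println(migrateModel("kimi-k2.5", "https://api.moonshot.ai/anthropic")) // kimi-k2.5
}
```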
| Command | Description |
|---|---|
| `/login <provider> <key>` | Set API key |
| `/provider <name>` | Switch provider |
| `/model <name>` | Switch model |
| `/mcp [list\|add\|status\|remove]` | Manage MCP servers (Model Context Protocol) |
| `/plan` | Enter planning mode |
| `/save` / `/load` | Session management |
| `/commit [-m "msg"]` | Git commit |
| `/pr --title "..."` | Create GitHub PR |
| `/undo [N]` | Undo last N file changes (max 20) |
| `/stats` | Session statistics incl. per-provider token/cost |
| `/theme` | Switch UI theme |
| `/help` | Show all commands (55+ available) |
| Key | Action |
|---|---|
| `Enter` | Send message |
| `Ctrl+C` | Interrupt |
| `Ctrl+P` | Command palette |
| `↑` / `↓` | History |
| `Tab` | Autocomplete |
| `?` | Show help |
Location: `~/.config/gokin/config.yaml`

```yaml
api:
  kimi_key: "sk-kimi-..."
  active_provider: "kimi"
model:
  name: "kimi-for-coding"
```

Prefer DeepSeek, GLM, or MiniMax? Just swap:

```yaml
# DeepSeek V4 (recommended – 1M context, prompt caching, cheap)
api: { deepseek_key: "sk-...", active_provider: "deepseek" }
model: { name: "deepseek-v4-pro" }

# GLM
api: { glm_key: "...", active_provider: "glm" }
model: { name: "glm-5" }

# MiniMax
api: { minimax_key: "...", active_provider: "minimax" }
model: { name: "MiniMax-M2.7" }
```

Full reference:

```yaml
api:
  kimi_key: ""                # Kimi Coding Plan key (sk-kimi-...)
  deepseek_key: ""            # DeepSeek V4 key (sk-...)
  glm_key: ""
  minimax_key: ""
  ollama_key: ""              # optional, only for remote Ollama with auth
  active_provider: "kimi"     # kimi | deepseek | glm | minimax | ollama
  ollama_base_url: "http://localhost:11434"
  retry:
    max_retries: 10
    retry_delay: 1s
  http_timeout: 120s
  stream_idle_timeout: 30s
  providers:
    kimi:
      http_timeout: 5m
      stream_idle_timeout: 120s
    glm:
      http_timeout: 5m
      stream_idle_timeout: 180s
    minimax:
      http_timeout: 5m
      stream_idle_timeout: 120s
    deepseek:
      http_timeout: 5m
      stream_idle_timeout: 120s

model:
  name: "kimi-for-coding"     # Kimi K2.6 (Coding Plan)
  temperature: 0.6
  max_output_tokens: 32768
  custom_base_url: ""         # override endpoint (e.g. Moonshot Dev API)
  enable_thinking: false      # extended thinking – supported on Kimi, DeepSeek (V4 + reasoner), GLM, MiniMax
  thinking_budget: 0          # 0 = provider default (Kimi, GLM 4.7+/5.x)
  force_weak_optimizations: false  # opt Strong-tier models into weak-tier safeguards

tools:
  timeout: 2m
  model_round_timeout: 5m
  bash:
    sandbox: true
    allowed_dirs: []

permission:
  enabled: true
  default_policy: "ask"       # allow, ask, deny

plan:
  enabled: true
  require_approval: true

ui:
  theme: "dark"               # dark, macos, light
  stream_output: true
  markdown_rendering: true
```

Project layout:

```
gokin/
├── cmd/gokin/        # CLI entry point
├── internal/
│   ├── app/          # Orchestrator & message loop
│   ├── agent/        # Multi-agent system
│   ├── client/       # AnthropicClient (compat: Kimi/GLM/MiniMax) + OllamaClient
│   ├── tools/        # 54 built-in tools
│   ├── mcp/          # MCP (Model Context Protocol) client + manager
│   ├── ui/           # Bubble Tea TUI
│   ├── config/       # YAML config
│   ├── permission/   # 3-level security + per-MCP-server isolation
│   ├── memory/       # Persistent memory
│   └── ...
```
~120K LOC • 100% Go • Production-ready
Contributions welcome! See CONTRIBUTING.md for:
- Development setup
- Code style guide
- Pull request process
```bash
# Dev setup
git clone https://github.com/ginkida/gokin.git
cd gokin
go mod download
go build -o gokin ./cmd/gokin

# Test
go test -race ./...

# Format
go fmt ./...
go vet ./...
```

MIT – Use freely, modify, distribute.
- Bubble Tea – TUI framework
- Lipgloss – Terminal styling
- Ollama – Local LLM runtime
Made with ❤️ by developers, for developers

