FlowPaste MVP Plan & Setup

Minimum Viable User Flow

Opening the app: Users immediately see a large text editing area and a fixed bottom toolbar (Record/Fix/Polish/Undo/Focus).
Recording: Click "Record" to start; click again to stop. The UI switches to "Transcribing". On completion, the transcript is inserted at the cursor position (or appended to the end of the editor), so it always lands in a predictable place.
Fix/Polish: Select a text segment, then click Fix or Polish. The app takes a snapshot, calls the API, and replaces the selection with the result (or replaces the full text if nothing is selected). A toast appears with an "Undo" button; clicking Undo restores the snapshot.
Error Handling: The app shows the failure reason (network, auth, timeout) with a "Retry" action. Editor content remains unchanged on failure. An in-progress operation can be cancelled to restore interactivity.
Focus Mode: Full-screen editor, hiding extraneous elements, keeping only the toolbar and undo functionality.
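
The snapshot, replace, and undo steps of the Fix/Polish flow can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: `EditorState`, `getTarget`, `replaceTarget`, and `UndoStack` are hypothetical names, and the API call that would run between `getTarget` and `replaceTarget` is left out.

```typescript
// Hypothetical editor state: the full text plus the current selection range.
interface EditorState {
  text: string;
  selectionStart: number;
  selectionEnd: number;
}

interface Target {
  start: number;
  end: number;
  text: string;
}

// Picks what to send to the API: the selection, or the full text when nothing is selected.
function getTarget(state: EditorState): Target {
  const hasSelection = state.selectionEnd > state.selectionStart;
  const start = hasSelection ? state.selectionStart : 0;
  const end = hasSelection ? state.selectionEnd : state.text.length;
  return { start, end, text: state.text.slice(start, end) };
}

// Replaces the target range with the API result and selects the new text.
function replaceTarget(state: EditorState, target: Target, result: string): EditorState {
  const text = state.text.slice(0, target.start) + result + state.text.slice(target.end);
  return { text, selectionStart: target.start, selectionEnd: target.start + result.length };
}

// Snapshots are pushed before each Fix/Polish call and popped by Undo.
class UndoStack {
  private snapshots: EditorState[] = [];

  push(state: EditorState): void {
    this.snapshots.push({ ...state });
  }

  pop(): EditorState | undefined {
    return this.snapshots.pop();
  }

  canUndo(): boolean {
    return this.snapshots.length > 0;
  }
}
```

Because `replaceTarget` returns a new state instead of mutating the old one, a failed API call can simply discard the snapshot and leave the editor untouched.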

Key Interface Structure (Wireframe)

Layout: Single page, clean background. Central area is a high-line-height, large-font textarea.
Bottom Toolbar: Record (Idle/Recording/Transcribing/Error), Fix, Polish, Undo (highlighted when available), Focus.
Status Feedback: Bottom-right toast/status bar showing "Transcribing...", "Fix Applied", "Timeout, Retry?", etc.
Interaction Rules: The selection takes priority; with no selection, the full text is processed. For long text, Fix allows full-text processing with a warning; Polish strongly suggests splitting the text into segments.
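
The four Record button states listed above could be driven by a small transition function like the one below. The state names come from the toolbar description; the event names (`CLICK_RECORD`, `TRANSCRIBE_OK`, and so on) and the exact transitions are assumptions made for illustration.

```typescript
type RecordState = "idle" | "recording" | "transcribing" | "error";

// Hypothetical events; the real app may name or group these differently.
type RecordEvent = "CLICK_RECORD" | "TRANSCRIBE_OK" | "TRANSCRIBE_FAIL" | "RETRY" | "CANCEL";

// Returns the next state; events that do not apply to the current state are ignored.
function nextRecordState(state: RecordState, event: RecordEvent): RecordState {
  switch (state) {
    case "idle":
      if (event === "CLICK_RECORD") return "recording";
      break;
    case "recording":
      // A second click stops recording and starts transcription.
      if (event === "CLICK_RECORD") return "transcribing";
      break;
    case "transcribing":
      if (event === "TRANSCRIBE_OK" || event === "CANCEL") return "idle";
      if (event === "TRANSCRIBE_FAIL") return "error";
      break;
    case "error":
      if (event === "RETRY") return "transcribing";
      if (event === "CLICK_RECORD") return "recording";
      break;
  }
  return state;
}
```

Keeping the transitions in one pure function makes it easy to check that, for example, clicking Record while transcribing cannot start a second recording.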

API Integration Design

Base URL: https://space.ai-builders.com/backend
Auth: Header Authorization: Bearer ${VITE_AI_BUILDER_TOKEN} (set in .env).
Timeout: 30s per request (60s for transcription). Network errors and timeouts are auto-retried once (backoff 0.8s -> 1.6s); all other failures require a manual retry. Use AbortController for cancellation.
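
One way to combine the timeout, cancellation, and auto-retry rules is sketched below. The helper names (`runWithTimeout`, `isRetryable`, `withRetry`) are hypothetical. The plan mentions a single auto-retry alongside two backoff values, so the sketch takes the delay list as a parameter and defaults to one retry after 0.8s; passing `[800, 1600]` would give two. It also assumes that network failures surface as a `TypeError`, which is how `fetch` reports them.

```typescript
// Builds an error tagged by name so callers can classify it without subclassing.
function makeTimeoutError(ms: number): Error {
  const err = new Error(`Request timed out after ${ms} ms`);
  err.name = "TimeoutError";
  return err;
}

// Runs `op` with a deadline. An optional user signal (e.g. a Cancel button)
// aborts the same controller, so `op` only has to watch one signal.
async function runWithTimeout<T>(
  op: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
  userSignal?: AbortSignal,
): Promise<T> {
  const controller = new AbortController();
  let timedOut = false;
  const timer = setTimeout(() => {
    timedOut = true;
    controller.abort();
  }, timeoutMs);
  const onUserAbort = () => controller.abort();
  if (userSignal) {
    if (userSignal.aborted) controller.abort();
    else userSignal.addEventListener("abort", onUserAbort);
  }
  try {
    return await op(controller.signal);
  } catch (err) {
    // Report our own deadline as a timeout; pass every other error through.
    throw timedOut ? makeTimeoutError(timeoutMs) : err;
  } finally {
    clearTimeout(timer);
    if (userSignal) userSignal.removeEventListener("abort", onUserAbort);
  }
}

// Timeouts and network failures are retried automatically; user aborts,
// HTTP-level errors, and anything else are left for a manual retry.
function isRetryable(err: unknown): boolean {
  return err instanceof Error && (err.name === "TimeoutError" || err instanceof TypeError);
}

// Calls `attempt`, retrying retryable failures once per entry in `delaysMs`.
async function withRetry<T>(
  attempt: () => Promise<T>,
  delaysMs: number[] = [800],
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let i = 0; ; i++) {
    try {
      return await attempt();
    } catch (err) {
      if (i >= delaysMs.length || !isRetryable(err)) throw err;
      await sleep(delaysMs[i]);
    }
  }
}
```

A request would then be wrapped as `withRetry(() => runWithTimeout((signal) => fetch(url, { ...init, signal }), 30000, cancelSignal))`, with 60000 for transcription.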

Speech to Text

POST /v1/audio/transcriptions (multipart/form-data)
Params: audio_file (binary) or audio_url; optional language (e.g., en, zh-CN).
Success: text contains the transcript.
Failures: 401/403 (Token error), 422 (Params), Timeout/Network (Toast + Retry).
Insertion: At cursor position to ensure predictability.
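
As a rough sketch of the two ends of this flow, the helpers below build the multipart body and insert the returned transcript at the cursor. The field names `audio_file`, `audio_url`, and `language` come from the parameter list above; the function names, the `recording.webm` file name, and the clamping of out-of-range cursor positions are assumptions.

```typescript
// Either a recorded Blob (sent as audio_file) or a URL string (sent as audio_url).
interface TranscriptionInput {
  audio: Blob | string;
  language?: string; // e.g. "en" or "zh-CN"
}

// Builds the multipart/form-data body for POST /v1/audio/transcriptions.
function buildTranscriptionForm(input: TranscriptionInput): FormData {
  const form = new FormData();
  if (typeof input.audio === "string") {
    form.append("audio_url", input.audio);
  } else {
    // The file name is an assumption; the plan does not specify one.
    form.append("audio_file", input.audio, "recording.webm");
  }
  if (input.language) form.append("language", input.language);
  return form;
}

// Inserts the transcript at the cursor and moves the cursor to the end of the inserted text.
function insertAtCursor(
  text: string,
  cursor: number,
  transcript: string,
): { text: string; cursor: number } {
  const pos = Math.min(Math.max(cursor, 0), text.length);
  return {
    text: text.slice(0, pos) + transcript + text.slice(pos),
    cursor: pos + transcript.length,
  };
}
```

Passing the `FormData` object as the `fetch` body lets the runtime set the multipart `Content-Type` header and boundary, so only the `Authorization` header needs to be added by hand.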

Fix / Polish (Chat Completions)

POST /v1/chat/completions
Model: deepseek (or configured model).
Params: tool_choice: "none", stream: true (recommended for perceived speed).
Messages: System prompt defines behavior; User prompt contains selection or full text.
Process: Snapshot -> Call API -> Replace -> Toast "Applied" + Undo.
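
The request body and stream handling might look like the sketch below. It assumes the endpoint streams OpenAI-style server-sent events, where each `data:` line carries a JSON chunk with `choices[0].delta.content` and the stream ends with `data: [DONE]`; the plan does not confirm that wire format, so treat it as an assumption. `buildChatBody` and `SseDeltaParser` are hypothetical names.

```typescript
interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Builds the JSON body for POST /v1/chat/completions using the parameters listed above.
function buildChatBody(systemPrompt: string, userText: string, model: string = "deepseek") {
  const messages: ChatMessage[] = [
    { role: "system", content: systemPrompt },
    { role: "user", content: userText },
  ];
  return { model, messages, tool_choice: "none", stream: true };
}

// Incremental parser: feed it decoded text chunks and it returns the content
// deltas found so far. Partial lines are buffered until the next chunk arrives.
class SseDeltaParser {
  private buffer = "";
  done = false;

  feed(chunk: string): string {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or empty); keep it for later.
    this.buffer = lines.pop() ?? "";
    let out = "";
    for (const raw of lines) {
      const line = raw.trim();
      if (line.slice(0, 5) !== "data:") continue;
      const payload = line.slice(5).trim();
      if (payload === "[DONE]") {
        this.done = true;
        break;
      }
      try {
        const json = JSON.parse(payload);
        // Assumed chunk shape: { choices: [{ delta: { content: "..." } }] }
        out += json?.choices?.[0]?.delta?.content ?? "";
      } catch {
        // Skip lines that are not valid JSON instead of failing the whole stream.
      }
    }
    return out;
  }
}
```

The deltas can be accumulated in a separate string and applied to the editor only after the stream completes, which keeps the editor unchanged if the request fails midway.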

Prompt Engineering (System Prompts)

The app supports bilingual prompts (English/Chinese), chosen according to the UI language selection.

Fix System Prompt (English)

You are a strict text correction interface.
Your output will directly replace the user's original text, so you must:
1. Output ONLY the corrected text content.
2. strictly FORBID any explanations, analysis, notes, correction remarks, or preambles.
3. strictly FORBID outputs like "Here is the corrected text:".
4. Maintain the original Markdown format.
5. If there are no typos, return the text exactly as is without any comments.

Polish System Prompt (English)

You are a professional text polishing interface.
Your output will directly replace the user's original text, so you must:
1. Output ONLY the polished text content.
2. strictly FORBID any explanations, evaluations, suggestions, polishing remarks, or preambles.
3. strictly FORBID outputs like "Here is the polished result:".
4. Maintain the original Markdown format.
5. Maintain a natural tone; do not over-embellish.

Long Text & Cost Strategy

Selection First: Always prioritize selected text.
Threshold: If no selection and text > ~8k chars, warn user: "Selecting a segment is recommended for speed and accuracy," but allow continuation.
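
The threshold rule can be expressed as a small check that runs before the API call. This is a sketch only: `checkLongText` is a hypothetical name, and 8000 is one reading of "~8k chars".

```typescript
// "~8k chars" in the plan; the exact cut-off is an assumption.
const LONG_TEXT_THRESHOLD = 8000;

const LONG_TEXT_WARNING = "Selecting a segment is recommended for speed and accuracy";

// Returns a warning message when the full text would be processed and it is long,
// or null when no warning is needed. The caller may still continue after a warning.
function checkLongText(fullText: string, selection: string): string | null {
  if (selection.length > 0) return null; // selection first: never warn on a selection
  return fullText.length > LONG_TEXT_THRESHOLD ? LONG_TEXT_WARNING : null;
}
```

Returning the message rather than blocking keeps the decision with the user, matching the "warn but allow continuation" rule.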

Implementation Roadmap

Basic Shell: Editor + Toolbar UI, Focus Mode, Toasts, Undo Stack.
Recording State Machine: Record -> Transcribe -> Insert -> Retry logic.
Fix/Polish Flow: Snapshot, Selection/Full text handling, Stream parsing, Undo.
I18n: English/Chinese support with language toggle.
Stability: Error boundaries, Auto-retry, Abort signals.

Local Development

npm install
npm run dev

Default: http://localhost:5173/

Testing

E2E: npm run test:e2e (Playwright)
Tests cover recording states, transcription insertion, Fix/Polish replacement, and undo functionality.
