Releases: corvo007/MioSub
v3.1.7
Fixes
- Editor: Fix long dropdown menus overflowing past the bottom of the screen, which left their scrollbar unreachable. The Speaker filter and the per-row speaker picker now cap their height to the space available above/below the trigger and scroll internally.
- Import: When importing a single-language subtitle (for example a translation-only export), MioSub now asks whether the text is the original or the translation, instead of always treating it as the original. Re-importing a translation-only file no longer puts the translation in the source column. Bilingual files import as before.
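The height cap described in the Editor fix can be sketched as a pure function. This is a minimal illustration, not MioSub's code: the name `placeMenu`, the 8 px margin, and the rule of opening toward the roomier side are all assumptions.

```typescript
interface MenuPlacement {
  side: "above" | "below";
  maxHeight: number; // pixels; content taller than this scrolls inside the menu
}

// Hypothetical helper: pick the side of the trigger with more room and
// cap the menu height to the space available there.
function placeMenu(
  triggerTop: number,
  triggerBottom: number,
  viewportHeight: number,
  margin = 8, // assumed gap kept between the menu and the viewport edge
): MenuPlacement {
  const spaceBelow = viewportHeight - triggerBottom - margin;
  const spaceAbove = triggerTop - margin;
  const side = spaceBelow >= spaceAbove ? "below" : "above";
  const space = side === "below" ? spaceBelow : spaceAbove;
  return { side, maxHeight: Math.max(0, space) };
}
```

The returned `maxHeight` would typically be applied as a CSS `max-height` together with `overflow-y: auto`, so the scrollbar stays inside the visible menu.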
v3.1.6
Fixes
- Editor: Fix dropdown menus and tooltips being clipped at their container edge — the per-row speaker picker (notably on a segment's last row), the Issues/Speaker filter menus in the toolbar, and the time-edit validation tooltip are now rendered in a portal so they are never cut off (Fixes LOCAL-L11).
v3.1.5
Features
- Settings: Add per-step custom Gemini model names — a new "Custom Model Names" dialog under Settings > Services lets you override the Gemini model used for each pipeline step (refinement, translation, glossary extraction, speaker analysis, batch proofread). Gemini-series only (validated on save); blank fields fall back to the defaults.
Fixes
- Whisper (Local): Detect when a model file (e.g. `.bin`) is mistakenly selected as the Whisper executable and show a clear, actionable error during preflight, instead of failing later with a cryptic spawn error (Fixes MIOSUB-5X).
- Whisper (Local): Retry the binary version/GPU probe once after a brief warm-up on transient cold-start failures, improving the reliability of version and GPU-support detection.
- Transcription (OpenAI): Classify network/connection failures (offline, DNS blocked, no proxy) as an actionable "check your network/proxy" error instead of a generic transcription failure, avoiding long retry loops.
- Glossary: Guard all glossary term lookups against undefined values to prevent a crash when a term is missing (Fixes LOCAL-L10).
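The network-error classification in the Transcription (OpenAI) fix might look roughly like the sketch below. The function names and the set of codes are assumptions (the codes are common Node.js system error codes, not a list taken from MioSub), and treating network errors as non-retryable is inferred from the "avoiding long retry loops" wording.

```typescript
type TranscriptionErrorKind = "network" | "generic";

// Assumed set of Node.js system error codes that indicate connectivity problems.
const NETWORK_ERROR_CODES = new Set([
  "ENOTFOUND",
  "EAI_AGAIN",
  "ECONNREFUSED",
  "ECONNRESET",
  "ETIMEDOUT",
]);

// Hypothetical classifier: map an unknown thrown value to an error kind.
function classifyTranscriptionError(err: unknown): TranscriptionErrorKind {
  const code =
    typeof err === "object" && err !== null
      ? (err as { code?: unknown }).code
      : undefined;
  return typeof code === "string" && NETWORK_ERROR_CODES.has(code)
    ? "network"
    : "generic";
}

// Network errors are surfaced to the user right away instead of being retried.
function shouldRetry(kind: TranscriptionErrorKind): boolean {
  return kind !== "network";
}
```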
v3.1.4
Fixes
- Error Handling: Fix TypeError crash when third-party API relays return numeric error codes. `isTransientError()` now handles non-string `.code` values correctly (Fixes MIOSUB-54).
- Vocal Separation: Include full stderr output in error messages for crash diagnostics, replacing the previous "key line extraction" that often captured irrelevant GPU enumeration logs instead of actual errors.
- Electron: Silence binary version-probe noise in Sentry — yt-dlp, Whisper, CTC Aligner, and BSRoformer startup version checks now cache results, use longer timeouts, and suppress handled timeout errors from reporting (Fixes MIOSUB-D, MIOSUB-37, MIOSUB-49, MIOSUB-4K).
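To illustrate the Error Handling fix, here is a minimal sketch of a transient-error check that tolerates numeric `.code` values. It is not the project's `isTransientError()`; the name `isTransientErrorSketch` and the set of codes are hypothetical.

```typescript
// Hypothetical set of codes treated as transient; the real list is not given in the notes.
const TRANSIENT_CODES = new Set(["ECONNRESET", "ETIMEDOUT", "429", "500", "502", "503"]);

function isTransientErrorSketch(err: unknown): boolean {
  if (typeof err !== "object" || err === null) return false;
  const rawCode = (err as { code?: unknown }).code;
  if (typeof rawCode !== "string" && typeof rawCode !== "number") return false;
  // Normalizing through String() avoids calling string methods on a number,
  // which is the kind of TypeError the fix describes.
  return TRANSIENT_CODES.has(String(rawCode).toUpperCase());
}
```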
v3.1.3
Features
- Settings: Add user-configurable proxy settings — support System Proxy (default), Custom Proxy (HTTP/HTTPS/SOCKS5 with auth), and Direct (bypass all proxies) modes in Settings > General (desktop only). Includes connection test button with latency feedback.
Fixes
- Pipeline: Show descriptive error message when all Whisper segments are filtered by anti-hallucination (e.g., music-only audio), instead of "unknown reason" (Fixes MIOSUB-1M).
- Editor: Keep the top action bar visible on generation error so users can retry without re-importing subtitles.
- Whisper: Sanitize control characters in whisper.cpp JSON output that could cause parse failures.
- Whisper: Proactively disable GPU on machines without GPU support to avoid hard crashes during initialization, instead of relying solely on post-crash retry.
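The control-character cleanup for whisper.cpp JSON output can be sketched as follows. Which characters MioSub actually strips is not stated, so the choice here (drop C0 control characters except tab, line feed, and carriage return, which JSON allows as whitespace between tokens) and the name `sanitizeJsonText` are assumptions.

```typescript
// Hypothetical sanitizer: remove raw control characters that JSON.parse rejects
// inside string values, keeping tab/LF/CR used for pretty-printing.
function sanitizeJsonText(raw: string): string {
  return raw.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F]/g, "");
}

// Parse after sanitizing; any remaining syntax error still propagates to the caller.
function parseWhisperJson(raw: string): unknown {
  return JSON.parse(sanitizeJsonText(raw));
}
```

Note that a raw line feed inside a string value would still fail to parse; this sketch does not try to repair that case.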
Security
- Proxy: Redact proxy credentials in log output and use structured URL parsing for validation, rejecting malformed URLs and unsupported protocols.
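As a rough illustration of the Security item, the sketch below validates a proxy URL with the standard `URL` parser and masks credentials before logging. The function names, the `***` mask, and the exact protocol allow-list are assumptions based on the modes listed above (HTTP/HTTPS/SOCKS5).

```typescript
// Assumed allow-list, derived from the proxy types named in the release notes.
const ALLOWED_PROXY_PROTOCOLS = new Set(["http:", "https:", "socks5:"]);

// Returns a parsed URL, or null for malformed input or unsupported protocols.
function parseProxyUrl(raw: string): URL | null {
  try {
    const url = new URL(raw);
    if (!ALLOWED_PROXY_PROTOCOLS.has(url.protocol) || url.hostname === "") return null;
    return url;
  } catch {
    return null; // new URL() throws a TypeError on malformed input
  }
}

// Produces a log-safe string; the raw input is never echoed when it is invalid.
function redactProxyUrl(raw: string): string {
  const url = parseProxyUrl(raw);
  if (url === null) return "<invalid proxy URL>";
  if (url.username !== "") url.username = "***";
  if (url.password !== "") url.password = "***";
  return url.toString();
}
```

Parsing first and redacting on the structured object avoids regex-based masking, which tends to miss credentials containing characters such as `@` or `:`.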
v3.1.2
Fixes
- Vocal Separation: Fix output file not found in segmented mode (videos > 30 min) — use the last "Saved output stem" match from BSRoformer stdout instead of the first, which was picking up a temporary segment file rather than the final merged output. Added glob fallback to search temp directories for output files (Fixes MIOSUB-4B).
- Pipeline: Replace 30-minute hard timeout with activity-based 10-minute inactivity timeout — the timer now resets on each progress update, allowing long videos to process indefinitely as long as progress is being made (Fixes MIOSUB-4D).
- Audio: Extend FFmpeg safe-path handling to detect special characters (brackets, percent signs, quotes) that FFmpeg interprets as glob patterns, in addition to existing non-ASCII detection (Fixes MIOSUB-48).
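The activity-based timeout from the Pipeline fix can be modeled with a small watchdog whose clock is passed in explicitly. This is a sketch under assumptions: the class name `InactivityWatchdog` and its methods are hypothetical, and a real pipeline would poll `isExpired` from a timer.

```typescript
// Hypothetical watchdog: expires only when no progress has been seen for limitMs.
class InactivityWatchdog {
  private lastActivityMs: number;

  constructor(private readonly limitMs: number, startMs: number) {
    this.lastActivityMs = startMs;
  }

  // Call on every progress update to reset the inactivity window.
  touch(nowMs: number): void {
    this.lastActivityMs = nowMs;
  }

  // True once more than limitMs has passed since the last progress update.
  isExpired(nowMs: number): boolean {
    return nowMs - this.lastActivityMs > this.limitMs;
  }
}
```

With a 10-minute limit and a progress update every 9 minutes, a job can run for hours without expiring, whereas a fixed 30-minute deadline would have cut it off regardless of progress.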
v3.1.1
Fixes
- Vocal Separation: Fix output file not found after successful separation by parsing the actual output path from BSRoformer stdout instead of hardcoding the `_stem_0.wav` suffix.
- Vocal Separation: Fix file write failures for Windows users with CJK usernames by using an ASCII-safe temp directory for output files, matching the existing handling for input paths.
- Vocal Separation: Normalize path comparison in file access security check to handle case and separator differences on Windows.
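The path normalization in the last item could be sketched as below. The helper names are hypothetical, and this version only handles case and separator differences; a real security check would also need to resolve `..` segments and symlinks before comparing.

```typescript
// Hypothetical normalizer: unify separators, fold case, drop a trailing slash.
function normalizeForCompare(p: string): string {
  const unified = p.replace(/[\\/]+/g, "/").toLowerCase();
  return unified.length > 1 && unified.charAt(unified.length - 1) === "/"
    ? unified.slice(0, -1)
    : unified;
}

// True when candidate is allowedDir itself or a path below it.
function isPathInside(candidate: string, allowedDir: string): boolean {
  const file = normalizeForCompare(candidate);
  const dir = normalizeForCompare(allowedDir);
  return file === dir || file.indexOf(dir + "/") === 0;
}
```

Appending the separator before the prefix test keeps a sibling directory such as `Temp2` from matching `Temp`.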
v3.1.0
Features
- Vocal Separation: Integrate BSRoformer neural vocal separation — automatically isolate vocals from background music/noise before transcription for dramatically better accuracy on music videos and noisy audio.
- Vocal Separation: Add Apple Silicon (Metal) GPU acceleration support for vocal separation on macOS arm64.
- Native VAD: Migrate from browser-based VAD to native binary for 10x+ faster voice activity detection, with reliable support for long videos without loading entire audio buffers into memory.
- Anti-Hallucination: Add multi-layer Whisper hallucination filter — curated blacklists for EN/CN/JP/KR, repetition cleaning (kana floods, phrase loops), and cross-subtitle deduplication using Dice coefficient matching.
- Changelog Modal: Auto-display release notes on first launch after update, with a "What's New" button in the About tab.
- Progress: Tighten end-to-end progress reporting with vocal separation percentage display.
- LLM: Normalize usage metadata across LLM adapters for consistent token tracking.
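The Dice-coefficient matching mentioned in the Anti-Hallucination item can be illustrated with character bigrams. This is a sketch, not the project's filter: the function names, the 0.9 threshold, and the choice to compare each line only against the previously kept line are assumptions.

```typescript
// Count character bigrams (as a multiset) in a string.
function bigramCounts(text: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (let i = 0; i < text.length - 1; i++) {
    const gram = text.slice(i, i + 2);
    counts.set(gram, (counts.get(gram) ?? 0) + 1);
  }
  return counts;
}

// Sørensen-Dice coefficient: 2 * |shared bigrams| / (|bigrams of a| + |bigrams of b|).
function diceCoefficient(a: string, b: string): number {
  const gramsA = bigramCounts(a);
  const gramsB = bigramCounts(b);
  const total = Math.max(a.length - 1, 0) + Math.max(b.length - 1, 0);
  if (total === 0) return a === b ? 1 : 0; // both too short to have bigrams
  let overlap = 0;
  gramsA.forEach((countA, gram) => {
    overlap += Math.min(countA, gramsB.get(gram) ?? 0);
  });
  return (2 * overlap) / total;
}

// Hypothetical dedup pass: drop a line that is near-identical to the last kept line.
function dedupeSubtitles(lines: string[], threshold = 0.9): string[] {
  const kept: string[] = [];
  for (const line of lines) {
    const previous = kept.length > 0 ? kept[kept.length - 1] : undefined;
    if (previous !== undefined && diceCoefficient(previous, line) >= threshold) continue;
    kept.push(line);
  }
  return kept;
}
```

For example, "night" and "nacht" share one bigram ("ht") out of four each, giving a coefficient of 0.25, well below any plausible duplicate threshold.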
Fixes
- Audio: Handle non-ASCII file paths in FFmpeg and audio decoder on Windows, fixing EILSEQ errors for users with CJK usernames.
- Pipeline: Pass abort signal through parallel processing for proper cancellation, cap semaphore release to prevent over-release, and fix Gemini token usage aggregation.
- Pipeline: Add WAV parser bounds checking, RFC 4180 CSV parsing, and null safety for Whisper segment text.
- Electron: Security hardening with async I/O and proper resource cleanup.
- React: Fix memo safety, stale closures, unmount guards, and selector stability.
- Types: Unify interfaces, type the IPC contract, and fix config/i18n inconsistencies.
- Audio: Improve vocal separation and compression plumbing.
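The "cap semaphore release" part of the Pipeline fix can be sketched with a counting semaphore whose permit count never exceeds its initial capacity. The class and method names are hypothetical and the project's own semaphore may differ.

```typescript
// Hypothetical bounded semaphore: extra release() calls cannot mint new permits.
class BoundedSemaphore {
  private available: number;
  private readonly waiters: Array<() => void> = [];

  constructor(private readonly max: number) {
    this.available = max;
  }

  acquire(): Promise<void> {
    if (this.available > 0) {
      this.available--;
      return Promise.resolve();
    }
    // No permit free: queue the caller until someone releases.
    return new Promise<void>((resolve) => {
      this.waiters.push(resolve);
    });
  }

  release(): void {
    const next = this.waiters.shift();
    if (next !== undefined) {
      next(); // hand the permit directly to the oldest waiter
      return;
    }
    // Cap at max so a double release cannot raise concurrency above the limit.
    this.available = Math.min(this.available + 1, this.max);
  }

  permits(): number {
    return this.available;
  }
}
```

Without the cap, an error path that releases twice would silently allow more parallel tasks than configured.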
Refactor
- Update Service: Split 997-line monolith into 7 focused modules (types, GitHub API, app updater, version checker, installer, IPC handlers, orchestrator).
- Pipeline: Harden generation flow with lint cleanup.
- Binary: Extract shared `detectBinaryVersion` utility for reuse across binary managers.
v3.0.20
Fixes
- macOS: Switch QuickJS engine from cosmopolitan APE builds to quickjs-ng native binaries, fixing codesign failures on macOS (Ref: MIOSUB-38).
- Glossary: Guard against undefined glossary terms in LLM responses, preventing crashes during glossary merging (Fixes MIOSUB-3A).
- JSON Parser: Strip markdown code fences from JSON responses when third-party proxies ignore `responseMimeType`, with improved error position logging (Fixes MIOSUB-11).
- Aligner: Wrap CTC aligner audio and model paths with `ensureAsciiSafePath()`, fixing alignment failures on non-ASCII paths (Fixes MIOSUB-3H).
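The fence stripping in the JSON Parser fix might look like the sketch below. It assumes the opening fence sits on its own line (optionally with a language tag) and the closing fence ends the response; the names `FENCE` and `stripCodeFence` are hypothetical.

```typescript
// Three backticks, built programmatically to keep this example easy to embed.
const FENCE = "`".repeat(3);

// Hypothetical helper: unwrap a fenced response, otherwise return it trimmed.
function stripCodeFence(text: string): string {
  let body = text.trim();
  if (body.indexOf(FENCE) !== 0) return body; // not fenced, nothing to do

  // Drop the whole opening line, including any language tag such as "json".
  const firstNewline = body.indexOf("\n");
  body = firstNewline === -1 ? "" : body.slice(firstNewline + 1);

  // Drop the closing fence if the response ends with one.
  if (body.length >= FENCE.length && body.slice(-FENCE.length) === FENCE) {
    body = body.slice(0, -FENCE.length);
  }
  return body.trim();
}
```

The result can then be passed to `JSON.parse` as usual; unfenced responses pass through unchanged apart from trimming.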
v3.0.19
Features
- Glossary: Added "Duplicate" button to glossary sidebar and widened the modal layout to reduce name truncation.
Fixes
- Linux: Fixed binary updates silently failing on AppImage by writing to a writable overlay directory instead of the read-only FUSE mount (Fixes MIOSUB-36).
- Linux: Added versioned `libonnxruntime.so.1` soname required by the Linux dynamic linker at runtime.
- i18n: Added 27 missing translation keys for English and Japanese locales.