Nick edited this page Oct 31, 2025 · 3 revisions

Welcome to DocStripper Wiki

This wiki contains guides, tutorials, and reference material for using and contributing to DocStripper.

📖 What is DocStripper?

DocStripper is an AI-powered batch document cleaner that automatically removes noise from text documents:

  • Page numbers - Lines with only digits (1, 2, 3...)
  • Headers/Footers - Common patterns like "Page X of Y", "Confidential"
  • Duplicate lines - Consecutive identical lines
  • Empty lines - Whitespace-only lines
  • Punctuation lines - Lines with only symbols (---, ***, ===)
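The rules above can be sketched in a few lines of Python. This is a minimal illustration only, not DocStripper's actual implementation: the function name `clean_lines` and the three regular expressions are hypothetical, and the real tool may recognize more header/footer patterns or treat edge cases differently.

```python
import re

# Hypothetical patterns for illustration; the real tool's rules may differ.
PAGE_NUMBER = re.compile(r"^\d+$")                      # lines with only digits
HEADER_FOOTER = re.compile(r"^(page \d+ of \d+|confidential)$", re.IGNORECASE)
PUNCTUATION_ONLY = re.compile(r"^[^\w\s]+$")            # ---, ***, ===


def clean_lines(lines):
    """Return the lines with the five kinds of noise listed above removed."""
    cleaned = []
    previous = None  # last line that was kept
    for line in lines:
        stripped = line.strip()
        if not stripped:                      # empty or whitespace-only
            continue
        if PAGE_NUMBER.match(stripped):
            continue
        if HEADER_FOOTER.match(stripped):
            continue
        if PUNCTUATION_ONLY.match(stripped):
            continue
        # Assumption: a line counts as a duplicate if it equals the last
        # kept line, even when removed noise sat between the two copies.
        if line == previous:
            continue
        cleaned.append(line)
        previous = line
    return cleaned


sample = [
    "Report title", "1", "Page 1 of 3", "Confidential", "", "   ",
    "---", "Body text", "Body text", "***", "Body text", "Section 2.1", "2",
]
print(clean_lines(sample))
```

Note that a line such as "Section 2.1" survives, because the page-number rule only matches lines made up entirely of digits.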

Features

  • 🤖 Smart Clean (Beta) - AI-powered cleaning using an on-device LLM
  • Fast Clean - Instant rule-based cleaning
  • 🔒 100% Private - All processing happens in your browser
  • 🌐 Web App - No installation required
  • 🖥️ CLI Tool - Command-line interface for batch processing

Getting Started

  1. Try it online: Visit https://kiku-jw.github.io/DocStripper/
  2. Read the Installation Guide: Installation
  3. Learn how to use it: Usage Guide
  4. Check out examples: see the Usage Guide

Need Help?

  • Check the FAQ for common questions
  • Join Discussions
  • Open an Issue for bugs or feature requests

Made with ❤️ for clean documents
