Commit 4b273d5

📝 Add 'How It Works' section and Telegram message
- Add detailed 'How It Works' section to website explaining the cleaning algorithm
- Create Telegram message file explaining the tool's functionality
- Add beautiful styling for the new section with step-by-step explanation
- Include privacy note about browser-only processing
1 parent 0253329 commit 4b273d5

2 files changed, +69 -0 lines changed


TELEGRAM_MESSAGE.txt

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+🧹 DocStripper: how does it clean documents?
+
+The tool works simply and effectively:
+
+1️⃣ **Reads the file** (TXT or DOCX)
+- Extracts all text from the document
+
+2️⃣ **Removes junk line by line:**
+• Empty lines
+• Page numbers (digits only: "1", "2", "3")
+• Headers/footers ("Page 1 of 5", "Confidential", "DRAFT")
+• Consecutive duplicates (when a line repeats back to back)
+
+3️⃣ **Returns clean text**
+- No extra junk
+- Only the useful content
+
+All of this happens right in the browser. Files are never sent anywhere, so privacy is maximal! 🔒
+
+Try it: https://kiku-jw.github.io/DocStripper2/
+
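
The line-by-line rules listed in this message can be sketched as follows. This is a minimal Python illustration, not the project's actual code: the commit does not show the implementation, so `HEADER_FOOTER_PATTERNS`, `is_noise`, and `clean_text` are hypothetical names, and the three patterns cover only the examples the message quotes.

```python
import re

# Hypothetical pattern list: the message names only these three examples,
# so the real tool may match more (or different) header/footer strings.
HEADER_FOOTER_PATTERNS = [
    re.compile(r"page \d+ of \d+", re.IGNORECASE),
    re.compile(r"confidential", re.IGNORECASE),
    re.compile(r"draft", re.IGNORECASE),
]


def is_noise(line: str) -> bool:
    """Return True for lines the message describes as junk."""
    stripped = line.strip()
    if not stripped:
        return True  # empty or whitespace-only line
    if re.fullmatch(r"[0-9]+", stripped):
        return True  # bare page number such as "1" or "12"
    # The whole line must be a header/footer phrase, not merely contain one.
    return any(p.fullmatch(stripped) for p in HEADER_FOOTER_PATTERNS)


def clean_text(text: str) -> str:
    """Drop noise lines, then collapse consecutive duplicate lines."""
    kept = []
    for line in text.splitlines():
        if is_noise(line):
            continue
        if kept and kept[-1] == line:
            continue  # identical to the previously kept line
        kept.append(line)
    return "\n".join(kept)
```

One design choice here is an assumption: duplicates are compared against the previous kept line, so two identical lines separated only by a removed blank line or page number are also collapsed. The message does not say whether the tool compares against the previous kept line or the previous original line.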

docs/index.html

Lines changed: 48 additions & 0 deletions
@@ -159,6 +159,54 @@ <h2>Prefer Command Line?</h2>
 <p>You can also use DocStripper as a CLI tool. Check out our <a href="https://github.com/kiku-jw/DocStripper2#installation" target="_blank">GitHub repository</a> for installation instructions.</p>
 </div>
 </section>
+
+<!-- How It Works Section -->
+<section class="how-it-works">
+<div class="container">
+<h2>How It Works</h2>
+<div class="how-it-works-content">
+<p class="how-it-works-intro">
+DocStripper uses a simple but effective line-by-line cleaning algorithm to remove noise from your documents:
+</p>
+
+<div class="how-it-works-steps">
+<div class="how-it-works-step">
+<div class="step-number">1</div>
+<div class="step-content">
+<h3>Read & Extract</h3>
+<p>The tool reads your file (TXT or DOCX) and extracts all text content. For DOCX files, it extracts text from the document structure.</p>
+</div>
+</div>
+
+<div class="how-it-works-step">
+<div class="step-number">2</div>
+<div class="step-content">
+<h3>Line-by-Line Analysis</h3>
+<p>Each line is analyzed and filtered based on several criteria:</p>
+<ul class="step-list">
+<li><strong>Empty lines</strong> — Removed completely</li>
+<li><strong>Page numbers</strong> — Lines containing only digits (e.g., "1", "2", "3")</li>
+<li><strong>Headers/Footers</strong> — Common patterns like "Page 1 of 5", "Confidential", "DRAFT"</li>
+<li><strong>Duplicate lines</strong> — Consecutive identical lines are collapsed into one</li>
+</ul>
+</div>
+</div>
+
+<div class="how-it-works-step">
+<div class="step-number">3</div>
+<div class="step-content">
+<h3>Clean Output</h3>
+<p>The cleaned text is assembled from the remaining lines, preserving the original formatting and structure while removing all noise.</p>
+</div>
+</div>
+</div>
+
+<div class="how-it-works-note">
+<p><strong>🔒 Privacy First:</strong> All processing happens entirely in your browser. Your files never leave your computer — no uploads, no server-side processing, complete privacy.</p>
+</div>
+</div>
+</div>
+</section>
 </main>
 
 <footer class="footer">
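
Step 1 of the new section says DOCX text is extracted "from the document structure" without showing how. A DOCX file is a ZIP archive whose main body is stored as XML in `word/document.xml`, so one way to do this is sketched below. This is an illustrative Python sketch only: `extract_docx_text` is a hypothetical name, and the site itself runs in the browser and may extract text differently.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_text(source) -> str:
    """Return one line of text per paragraph in a DOCX file.

    `source` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")
    root = ET.fromstring(xml_bytes)
    lines = []
    for paragraph in root.iter(f"{W_NS}p"):
        # A paragraph's text is split across one or more <w:t> runs.
        runs = [node.text or "" for node in paragraph.iter(f"{W_NS}t")]
        lines.append("".join(runs))
    return "\n".join(lines)
```

The sketch reads only the main document part and ignores tabs, manual line breaks, and the separate header and footer parts of the archive. Its output is plain newline-separated text, which is the shape a line-by-line filter like the one described in step 2 expects.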
