- Add detailed 'How It Works' section to website explaining the cleaning algorithm
- Create Telegram message file explaining the tool's functionality
- Add beautiful styling for the new section with step-by-step explanation
- Include privacy note about browser-only processing
<p>You can also use DocStripper as a CLI tool. Check out our <a href="https://github.com/kiku-jw/DocStripper2#installation" target="_blank">GitHub repository</a> for installation instructions.</p>
</div>
</section>
<!-- How It Works Section -->
<section class="how-it-works">
<div class="container">
<h2>How It Works</h2>
<div class="how-it-works-content">
<p class="how-it-works-intro">
DocStripper uses a simple but effective line-by-line cleaning algorithm to remove noise from your documents:
</p>
<div class="how-it-works-steps">
<div class="how-it-works-step">
<div class="step-number">1</div>
<div class="step-content">
<h3>Read &amp; Extract</h3>
<p>The tool reads your file (TXT or DOCX) and extracts all text content. For DOCX files, it extracts text from the document structure.</p>
</div>
</div>
<div class="how-it-works-step">
<div class="step-number">2</div>
<div class="step-content">
<h3>Line-by-Line Analysis</h3>
<p>Each line is analyzed and filtered based on several criteria:</p>
<ul>
<li><strong>Headers/Footers</strong> — Common patterns like "Page 1 of 5", "Confidential", "DRAFT"</li>
<li><strong>Duplicate lines</strong> — Consecutive identical lines are collapsed into one</li>
</ul>
</div>
</div>
<div class="how-it-works-step">
<div class="step-number">3</div>
<div class="step-content">
<h3>Clean Output</h3>
<p>The cleaned text is assembled from the remaining lines, preserving the original formatting and structure while removing the lines identified as noise.</p>
</div>
</div>
</div>
<div class="how-it-works-note">
<p><strong>🔒 Privacy First:</strong> All processing happens entirely in your browser. Your files never leave your computer — no uploads, no server-side processing, complete privacy.</p>
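The line-by-line filtering described in steps 2 and 3 can be illustrated with a minimal Python sketch. This is not DocStripper's actual code: the function names (`is_header_or_footer`, `clean_text`) and the three regular expressions are assumptions built only from the examples quoted in the page text ("Page 1 of 5", "Confidential", "DRAFT"), and the real tool may apply additional or different rules.

```python
import re

# Hypothetical patterns derived from the examples in the page text;
# the real tool's rule set is not shown in this diff and may differ.
HEADER_FOOTER_PATTERNS = [
    re.compile(r"^page\s+\d+\s+of\s+\d+$", re.IGNORECASE),
    re.compile(r"^confidential$", re.IGNORECASE),
    re.compile(r"^draft$", re.IGNORECASE),
]


def is_header_or_footer(line: str) -> bool:
    """Return True if the whole line (ignoring surrounding whitespace) matches a noise pattern."""
    stripped = line.strip()
    return any(pattern.match(stripped) for pattern in HEADER_FOOTER_PATTERNS)


def clean_text(text: str) -> str:
    """Drop header/footer lines and collapse consecutive identical lines."""
    kept = []
    previous = None  # last line that was kept, used for duplicate detection
    for line in text.splitlines():
        if is_header_or_footer(line):
            continue  # noise line: skip it entirely
        if line == previous:
            continue  # same as the last kept line: collapse the repeat
        kept.append(line)
        previous = line
    # Remaining lines are joined in their original order.
    return "\n".join(kept)


if __name__ == "__main__":
    sample = "Title\nPage 1 of 5\nBody\nBody\nCONFIDENTIAL\nEnd"
    print(clean_text(sample))
```

Running the script prints `Title`, `Body`, and `End` on separate lines. One design choice worth noting: duplicates are compared against the last kept line, so two identical lines separated only by a removed header are also collapsed, and runs of blank lines shrink to a single blank line. Whether the real tool behaves the same way is not stated in the page text.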