A tokeniser-based HTML and CSS document parser and minifier, written in PHP.
This project is currently alpha code and is not recommended for production use.
HTMLDoc is an HTML and CSS parser designed primarily for minifying HTML documents. The plan is to also allow the document structure to be queried, so that attribute values and text node values can be extracted.
Both parsers are built around a tokeniser, which makes document processing more reliable and (hopefully) faster than regex-based minifiers. Regex-based minifiers are blunt instruments and can cause problems when a pattern matches in the wrong place.
Because documents are read into a structured format, performing operations on specific parts of a document is much easier and more reliable, and will in future enable documents to be queried and data extracted.
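To illustrate the difference between the two approaches, here is a minimal sketch. It does not use HTMLDoc and is not how HTMLDoc is implemented: `regexMinify()`, `tokenise()` and `tokenMinify()` are hypothetical helpers written for this example only, and the tag detection is deliberately simplistic. The sketch shows a blunt regex collapsing whitespace inside a `<pre>` block, where it matters, while a token-based pass can skip that part of the document.

```php
<?php
declare(strict_types=1);

// Naive regex approach: collapse every run of whitespace, wherever it occurs.
function regexMinify(string $html): string {
    return preg_replace('/\s+/', ' ', $html);
}

// Hypothetical, very simplified tokeniser: splits the input into tag and text tokens.
// Assumes well-formed input with no '>' inside attribute values.
function tokenise(string $html): array {
    $tokens = [];
    $pos = 0;
    $len = strlen($html);
    while ($pos < $len) {
        if ($html[$pos] === '<') {
            // A tag runs up to and including the next '>'
            $end = strpos($html, '>', $pos);
            $end = $end === false ? $len : $end + 1;
            $tokens[] = ['type' => 'tag', 'value' => substr($html, $pos, $end - $pos)];
        } else {
            // Text runs up to the next '<'
            $end = strpos($html, '<', $pos);
            $end = $end === false ? $len : $end;
            $tokens[] = ['type' => 'text', 'value' => substr($html, $pos, $end - $pos)];
        }
        $pos = $end;
    }
    return $tokens;
}

// Token-based approach: only collapse whitespace in text tokens outside <pre>.
function tokenMinify(string $html): string {
    $out = '';
    $inPre = false;
    foreach (tokenise($html) as $token) {
        if ($token['type'] === 'tag') {
            // Simplification: any tag starting with "<pre" counts as <pre>
            if (stripos($token['value'], '<pre') === 0) {
                $inPre = true;
            } elseif (stripos($token['value'], '</pre') === 0) {
                $inPre = false;
            }
            $out .= $token['value'];
        } elseif ($inPre) {
            $out .= $token['value']; // preformatted text is left untouched
        } else {
            $out .= preg_replace('/\s+/', ' ', $token['value']);
        }
    }
    return $out;
}

$html = "<p>Hello   world</p>\n<pre>line 1\n    line 2</pre>";

echo regexMinify($html), "\n"; // whitespace inside <pre> is flattened
echo tokenMinify($html), "\n"; // whitespace inside <pre> is preserved
```

Because the second version knows which token it is looking at, the rule "leave `<pre>` content alone" is a single condition rather than an increasingly complicated pattern.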
To minify an HTML document:
use hexydec\html\htmldoc;
$doc = new htmldoc();
// load from a variable
$doc->load($html);
// or load from a URL instead
$doc->open($url);
// minify the document
$doc->minify();
// compile back to HTML
echo $doc->save();
Find out more details of how to use HTMLDoc in the docs.