HTML Rewriter #1193
headers: { 'Content-Type': 'text/html' },
}).text();
strictEqual(textEscape, expectedEscape);
});
Can we have a test that makes sure Content-Type directives also work ok? As in, that they're still correctly detected as HTML?
Additionally, is there any way to force this to run for, e.g., xhtml, xml, or other things that might technically be parseable? I guess you're expected to just wrap it in `new Response(incomingBody, { headers: { 'Content-Type': 'text/html' } })`?
The rewriter isn't affected by the `Content-Type` headers, so I'm not sure we need that here, but happy to add it if you still think it's worthwhile!
ah, if it doesn't care about the content-type header, then that's fine. Since you were using it in every test, I assumed that meant the rewriter wouldn't fire unless it detected HTML content type.
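For reference, the wrapping discussed in this thread can be sketched with nothing but the standard Fetch `Response` constructor. `relabelAsHtml` and the sample body are hypothetical names chosen for illustration. Since the rewriter reportedly ignores the header, this would only matter to other consumers of the response.

```javascript
// Hypothetical helper: re-wrap a body so the response is labelled text/html.
// Uses only the standard Fetch API Response constructor.
function relabelAsHtml(incomingBody) {
  return new Response(incomingBody, {
    headers: { 'Content-Type': 'text/html' },
  });
}

// Usage: an XHTML fragment relabelled as HTML; the body itself is unchanged.
const relabelled = relabelAsHtml('<p xmlns="http://www.w3.org/1999/xhtml">hi</p>');
console.log(relabelled.headers.get('Content-Type'));
```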
lgtm
Thanks for adding this! Looked for something like this in Fastly's JS API a month ago and didn't find it, so I ended up writing my own (non-streaming) transformer using htmlparser2 and cheerio. I switched over to this one today and performance is way better (4s to transform 1000 small documents locally, down to <1s) and my WASM file size dropped by 7 MB.
@IGx89 great to hear, thank you!
Adds an HTML rewriter feature with an interface mostly the same as Akamai's html-rewriter. Uses the `lol-html` library under the hood.

A few potentially questionable design decisions which we may want to go a different way on:

- […] `lol-html`'s C API into this repo. This was done for a few reasons:
  - […] `rustc` version that we build with, so I patched that change.
  - […] `lol-html` repo and is not published on crates.io.
  - […] `lib.name` in its `Cargo.toml`, which is incompatible with StarlingMonkey's `add_rust_lib` function, so I patched that too.

  […] `rustc` we compile with (which is a change made in upstream StarlingMonkey already), patch `add_rust_lib` to support custom `lib.name`s, then grab the C API crate through `cargo` with a simple wrapper library, similar to the existing crates in StarlingMonkey. Alternatively, we could make a fork of `lol-html` and point at that instead.
- […] `insert_implicit_close` feature of Akamai's API, because it seems less important and I don't see a simple way to do this from `lol-html`. I also extended their API with an `escapeHTML` option for all insertion functions, which allows inserting HTML content as text.
- […] `fastly:html-rewriter`. Ideally someone who knows their way around the module system better can tell me what to do here 😄

BEGIN_COMMIT_OVERRIDE
feat: HTML Rewriter
END_COMMIT_OVERRIDE
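
To illustrate what "inserting HTML content as text" means for the `escapeHTML` option, here is a standalone sketch of the escaping idea. It does not use the `fastly:html-rewriter` module or `lol-html`. `escapeHtmlText` and `insertContent` are hypothetical helpers, and the set of characters the real option escapes is an assumption.

```javascript
// Hypothetical helper, not part of fastly:html-rewriter: escape the characters
// that an HTML parser would otherwise treat as markup.
function escapeHtmlText(content) {
  return content
    .replace(/&/g, '&amp;') // runs first so the entities added below are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Hypothetical stand-in for an insertion function that takes an escapeHTML flag.
function insertContent(existing, content, { escapeHTML = false } = {}) {
  return existing + (escapeHTML ? escapeHtmlText(content) : content);
}

// Without the flag the content is inserted as markup; with it, as literal text.
const asMarkup = insertContent('<div>', '<b>hi</b>');
const asText = insertContent('<div>', '<b>hi</b>', { escapeHTML: true });
console.log(asMarkup);
console.log(asText);
```

With the flag set, a browser would display the tags literally instead of rendering bold text.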