Move to pandoc for markdown and premailer for CSS #16

guyest · 2018-12-14T18:34:24Z

This is a proof-of-principle, but fully functional, pull request. For my purposes, using the pandoc variant of markdown just makes the most sense. I also think it is the most robust and featureful, so I require it as an option in any markdown-based tool I use. It was pretty easy to implement that in muttdown, leveraging other packages already available. pynliner also does not seem all that well maintained, so I opted for premailer to perform the CSS inlining instead, which seems reasonable since it is well tested and designed for this very purpose.

I currently have this version of muttdown set up as a filter on my local Postfix install and it works great. The last detail at this stage I would like to see changed is to remove the !m sigil from the text/plain part. That is not currently written back into the multipart object.

So, I am going to continue down this path regardless of whether or not you would like to merge, but if this all seems in line with what you wanted to do then I am happy to have that happen in muttdown's birth repo (this one).


Roguelazer

I'm okay using premailer, since it does indeed appear to be better-maintained than pynliner.

I'm less confident about making pandoc a requirement; I would generally not use pypandoc's statically-compiled gigantic pandoc binary, and I imagine a lot of people don't want the entire GHC toolchain in order to send email. Perhaps making it an optional dependency and using it if it's present, and otherwise using the pure-python Markdown interpreter would be a better option. I'm also pretty sure I just found a security bug in pypandoc in 30 seconds of inspection, which is sad-making.

See below for some style notes and bugs.
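The optional-dependency idea above could look roughly like the following. This is a minimal sketch, not muttdown's actual code: `pick_markdown_backend` and `render_html` are hypothetical helper names, and the only library calls assumed are `pypandoc.convert_text` (as used in this PR) and `markdown.markdown` from the pure-Python Markdown package. Note that a successful `import pypandoc` may not prove that a pandoc binary is actually installed, so a real implementation would probably also want to handle a conversion failure.

```python
import importlib


def pick_markdown_backend(candidates=("pypandoc", "markdown")):
    """Return (name, module) for the first importable backend, else (None, None)."""
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except ImportError:
            continue
    return None, None


def render_html(text, backend_name, module):
    """Convert markdown text to HTML with whichever backend was found."""
    if backend_name == "pypandoc":
        # Same call shape as in this PR, minus the CSS-related extra_args.
        return module.convert_text(text, "html5", format="md")
    if backend_name == "markdown":
        return module.markdown(text)
    raise RuntimeError("no markdown backend available")
```

Usage would be something like `name, mod = pick_markdown_backend()` once at startup, followed by `render_html(text, name, mod)` per message. The two backends produce slightly different HTML, so the output is not interchangeable byte for byte.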

Roguelazer · 2018-12-14T18:54:06Z

muttdown/main.py

        if not text.startswith('!m'):
            return None
        text = re.sub('\s*!m\s*', '', text, re.M)
+        f = tempfile.NamedTemporaryFile(suffix='_panmail.css')


temporary files should always be used as context managers so that they get cleaned up on pypy

This bothered me as well -- to be sure. I did initially implement it that way. The reason I changed was to try and change less of the underlying code structure. I think handling the signature separately is where I ran into an awkward situation so found my way around it by just manually closing the file later (which also deletes it). Can do better on that though.
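For reference, a context-managed version of the temporary CSS file might look like this. It is a sketch under assumptions: `with_css_tempfile` and its `use` callback are hypothetical names standing in for the pandoc call, and text mode (`mode='w'`) is chosen on the assumption that the configured CSS is a `str`.

```python
import tempfile


def with_css_tempfile(css, use):
    """Write css to a named temp file, call use(path), and clean up afterwards."""
    with tempfile.NamedTemporaryFile(mode='w', suffix='_panmail.css') as f:
        if css:
            f.write(css)
            # Flush so that another process reading f.name sees the content.
            f.flush()
        # The file still exists here; it is deleted when the with block exits,
        # even if use() raises.
        return use(f.name)
```

Everything that needs the file (here, the `use` callback) has to run inside the `with` block, which is why the signature handling would need to move inside it as well. Reopening the file by name while it is still open is fine on POSIX systems but has historically not worked on Windows.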

Roguelazer · 2018-12-14T18:55:28Z

muttdown/main.py

-            md += '<br />'.join(signature.split('\n'))
-            md += '</p></div>'
+            message  = pypandoc.convert_text(pre_signature, 'html5', format='md', \
+                    extra_args=["--css="+f.name, "--self-contained", "--metadata=pagetitle:'email'"])


preferred style would be something like

    message = pypandoc.convert_text(
        pre_signature,
        'html5',
        format='md',
        extra_args=['--css=' + f.name, ...]
    )

Roguelazer · 2018-12-14T18:55:38Z

muttdown/main.py

            return None
        text = re.sub('\s*!m\s*', '', text, re.M)
+        f = tempfile.NamedTemporaryFile(suffix='_panmail.css')
+        if config.css: f.write(config.css)


never do one-line if statements in python

Roguelazer · 2018-12-14T18:56:13Z

muttdown/main.py

-            md = pynliner.fromString(md)
-        message = MIMEText(md, 'html', _charset="UTF-8")
+            message  = pypandoc.convert_text(text, 'html5', format='md', \
+                    extra_args=["--css="+f.name, "--self-contained", "--metadata=pagetitle:'email'"])


see above comment regarding correct indentation for multiline function call arguments

Roguelazer · 2018-12-14T18:56:42Z

muttdown/main.py

+                    extra_args=["--css="+f.name, "--self-contained", "--metadata=pagetitle:'email'"])
+        message = premailer.transform(message) # In-line the CSS.
+        message = MIMEText(message, 'html', _charset="UTF-8")
+        f.close()


this will no longer be necessary when you switch to a context manager

Roguelazer · 2018-12-14T18:57:08Z

muttdown/main.py

        print(rebuilt.as_string())
    elif args.sendmail_passthru:
-        cmd = c.sendmail.split() + ['-f', args.envelope_from] + args.addresses
+        cmd = c.sendmail.split() + ['-G', '-i', '-f', args.envelope_from] + args.addresses


-G would need to be controlled by a config flag since it would only be appropriate when using this as a postfix tool
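One way to gate the flag, sketched under assumptions: `build_sendmail_cmd` is a hypothetical helper and `postfix_gateway` is a hypothetical config option name, neither of which exists in muttdown. The `-i` flag is kept unconditionally here only because the PR adds it and the review comment objects to `-G` alone.

```python
def build_sendmail_cmd(sendmail, envelope_from, addresses, postfix_gateway=False):
    """Build the sendmail argv, adding -G only when explicitly enabled."""
    cmd = sendmail.split()
    if postfix_gateway:
        # Postfix-specific flag; off by default so other MTAs are unaffected.
        cmd.append('-G')
    cmd += ['-i', '-f', envelope_from]
    cmd += list(addresses)
    return cmd
```

With the default of `False`, the behaviour for existing non-Postfix setups differs from the pre-PR command only by the added `-i`.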

Roguelazer · 2018-12-14T18:57:30Z

muttdown/main.py


        proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, shell=False)
-        proc.communicate(rebuilt.as_string())
+        proc.communicate(rebuilt.as_string().encode())


calling encode without an argument has pretty confusing behavior on a lot of systems since it tries to guess the default encoding

Arguable; I think the default is actually the least confusing possible behavior. I would opt instead for just dropping support for legacy Python, which is really the root cause of confusion around strings/bytes, especially with the additional decode bound method that converts the HTML from base64 to the usual text encoding.
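For context on the two positions: on Python 3, `str.encode()` with no argument always uses UTF-8, independent of locale. On Python 2, `as_string()` returns a byte string, and calling `.encode()` on it first decodes implicitly as ASCII, which fails on non-ASCII content. Naming the charset explicitly avoids relying on either default. A minimal sketch, with `to_wire_bytes` as a hypothetical helper name and Python 3 assumed:

```python
def to_wire_bytes(message_text, charset='utf-8'):
    """Encode the serialized message with an explicit charset before piping it."""
    return message_text.encode(charset)
```

The call in the diff would then read `proc.communicate(to_wire_bytes(rebuilt.as_string()))`, or simply `rebuilt.as_string().encode('utf-8')`.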

guyest · 2018-12-14T20:18:32Z

Regarding the size of pandoc as a requirement, this is a sore point with Haskell code in general. The binary file is friggin tiny (29KiB), but yes it can be confusingly obscene to install if you are not used to how Haskell does things. That said, it is probably fair that it should not be the only option for markdown processing, but that just introduces another branch point as slightly different conventions may need to be handled and the html output from either processor will have their own peculiarities.

Could you elaborate on the suspected security vulnerability in pypandoc?

Also, LOL: "All checks have failed." I bet Travis had fun trying to build pandoc like 7 times.

gfa · 2019-01-22T07:01:35Z

Just one note: moving away from pynliner would help muttdown move from the contrib to the main area of the Debian archive.

premailer is not yet in Debian. I read its license and it looks good for inclusion in main (pending approval from the ftp masters). I did not check premailer's dependencies (if any).

If muttdown switches, I'd package premailer and its deps, and when all deps are in Debian I'd switch muttdown from contrib to main.

Moving to pandoc for markdown (or anything else it supports, in principle); using premailer for CSS inlining. (caf5e6b)

Roguelazer requested changes Dec 14, 2018
