HTML to PDF with Chromium
Sometimes we need to render an HTML page to PDF. Some use cases include:
- Rendering Markdown to PDF, with HTML as an intermediary step
- Designing print media (posters, flyers, résumés) using HTML/CSS
- Archiving web pages for easy access
I've seen a couple of resources recommending wkhtmltopdf. Do not use wkhtmltopdf. It is horribly outdated: it is built on an old, unmaintained version of QtWebKit. It's considered insecure, meaning that running it on untrusted HTML pages risks a malware infection. And even if you fully trust the input data, the PDF likely won't render correctly, because wkhtmltopdf doesn't support many modern web features such as flexbox.
Instead, you can just use your install of chromium, or any chromium-based web browser. Using some command-line flags, we can tell chromium to use the printing system to render a web page to PDF. Example:
chromium \
--headless \
--print-to-pdf=output.pdf \
--no-margins \
--no-pdf-header-footer \
--generate-pdf-document-outline \
"file://$(realpath input.html)"
Note the --generate-pdf-document-outline flag. It tells chromium to generate a PDF outline using the h1, h2, h3, ... tags found in the document. It's not strictly necessary, but people who navigate PDFs via the outline will appreciate it.
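If you run this command often, it may be worth wrapping it in a small shell function. The following is only a sketch: the function names and the CHROMIUM_BIN variable are made up for this example (chromium itself does not read CHROMIUM_BIN), and the flags are simply the ones from the command above. The PDF check relies on the fact that PDF files start with the bytes "%PDF-".

```shell
#!/bin/sh
# Sketch: wrap the chromium invocation shown above in a reusable function.
# CHROMIUM_BIN is a hypothetical override so that any chromium-based
# browser binary can be used; it defaults to "chromium".

html_to_pdf() {
  if [ "$#" -ne 2 ]; then
    echo "usage: html_to_pdf INPUT.html OUTPUT.pdf" >&2
    return 2
  fi
  if [ ! -f "$1" ]; then
    echo "html_to_pdf: input not found: $1" >&2
    return 1
  fi
  # Same flags as the example above; the input becomes an absolute file:// URL.
  "${CHROMIUM_BIN:-chromium}" \
    --headless \
    --print-to-pdf="$2" \
    --no-margins \
    --no-pdf-header-footer \
    --generate-pdf-document-outline \
    "file://$(realpath "$1")"
}

# Cheap sanity check: succeed only if the file starts with the PDF magic bytes.
looks_like_pdf() {
  [ "$(head -c 5 "$1" 2>/dev/null)" = "%PDF-" ]
}
```

With both functions loaded, a conversion could then look like `html_to_pdf input.html output.pdf && looks_like_pdf output.pdf`.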
Inside containers
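If you need to use this trick inside a containerized environment (e.g. docker, podman), you need to add the --no-sandbox and --disable-gpu options. Keep in mind that --no-sandbox disables chromium's own sandbox, leaving the container as the only isolation layer. The full command becomes: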
chromium \
--headless \
--no-sandbox \
--disable-gpu \
--print-to-pdf=output.pdf \
--no-margins \
--no-pdf-header-footer \
--generate-pdf-document-outline \
"file://$(realpath input.html)"
Read-only containers
If you must use a read-only container, make sure that the home directory of the user running chromium is writable (e.g. mounted as a tmpfs). Otherwise, chromium will crash.
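Both container notes (the extra flags and the writable home directory) can be folded into a few preflight helpers. This is a rough sketch, not a robust detector: it assumes that Docker creates /.dockerenv and Podman creates /run/.containerenv, which will miss other runtimes. The function names and the optional root-prefix argument are invented for this example.

```shell
#!/bin/sh
# Sketch: container preflight helpers for headless chromium.

# Heuristic container check based on marker files.
# $1 is an optional root prefix (hypothetical, only useful for testing).
in_container() {
  root=${1:-}
  [ -e "$root/.dockerenv" ] || [ -e "$root/run/.containerenv" ]
}

# Print the extra container flags, one per line, but only inside a container.
container_flags() {
  if in_container "${1:-}"; then
    printf '%s\n' --no-sandbox --disable-gpu
  fi
}

# Succeed only if the given directory (default: $HOME) exists and is writable.
home_is_writable() {
  dir=${1:-${HOME:-}}
  [ -n "$dir" ] && [ -d "$dir" ] && [ -w "$dir" ]
}
```

A script could call `home_is_writable || exit 1` before launching chromium, and pass `$(container_flags)` unquoted on the command line so that each flag becomes its own argument. With Docker, a read-only container with a writable home can be set up by combining `--read-only` with a `--tmpfs` mount on the home directory.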
With flatpak
If you have Chromium (or a fork of Chromium) installed via flatpak, you can use it to produce PDF files like this:
flatpak run \
--filesystem="$(realpath .)" \
org.chromium.Chromium \
--headless \
--print-to-pdf=output.pdf \
--no-margins \
--no-pdf-header-footer \
--generate-pdf-document-outline \
"file://$(realpath input.html)"
Here we add the --filesystem option to expose the current directory (which contains input.html and will contain output.pdf) to the sandbox.
Full bleed background
Say, for example, you're designing a poster using HTML and CSS, and you decide to set a background color:
body {
background-color: #EEE;
}
If you want the resulting PDF to not have a white margin around the document, you also need to add the following CSS to your page:
@media print {
@page { margin: 0; }
body { margin: 0; width: 100vw; height: 100vh; }
}
This removes any margin between the edge of the PDF page and the body element.
Page sizes
The page size can also be specified by adding CSS to the HTML page:
@page {
margin: 0;
padding: 0;
size: 210mm 297mm; /* A4 page */
}
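If you render the same page at several paper sizes, the @page rule can be generated instead of edited by hand. The helper below is a hypothetical sketch (the name page_css and the set of supported sizes are my own choice); it prints a rule shaped like the one above, using the standard dimensions for each size.

```shell
#!/bin/sh
# Sketch: print an @page rule for a named paper size (portrait orientation).
page_css() {
  case "$1" in
    A3)     size="297mm 420mm" ;;
    A4)     size="210mm 297mm" ;;
    A5)     size="148mm 210mm" ;;
    Letter) size="8.5in 11in" ;;
    *)
      echo "page_css: unknown page size: $1" >&2
      return 1
      ;;
  esac
  printf '@page {\n  margin: 0;\n  size: %s;\n}\n' "$size"
}
```

For example, `page_css A4 > page.css` writes a stylesheet that the HTML page can then include.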