Printing Blogs

6th of May, 2026

Printing out blog posts greatly elevates the reading experience. I recently tried it and had a lot of fun highlighting, margin-scribbling, dog-earing. This made me want to try printing out entire blogs, and here we'll do exactly that. The preview cover pages below link to printable excerpts of paulgraham.com, marginalrevolution.com, maxhodak.com, guzey.com, and this blog.

For the first printee, what else could it be but Paul Graham's essays? Fetching all 228, a cool 550k words, we already approach 1000 pages with reasonable print settings on A4 paper. Assuming a 20€ printing budget and 0.1€ per page, we can afford at most 200 pages, so we instead aim for a 75k word / ~150 page digest for now. But which posts do we want to include? This is a Knapsack problem: each post comes with a cost, the number of words, and a value, how interesting it is to read.
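The budget arithmetic can be spelled out in a few lines. This is only a sketch: the function names are made up for illustration, and the ~500 words per page figure is merely what the 75k word / ~150 page target implies, not a measured value.

```python
def max_pages(budget_cents: int, cost_per_page_cents: int) -> int:
    """Largest number of pages the budget pays for (integer cents avoid float noise)."""
    return budget_cents // cost_per_page_cents


def word_budget(pages: int, words_per_page: int) -> int:
    """Word budget for a page target, given an assumed page density."""
    return pages * words_per_page


# 20 EUR budget at 0.10 EUR per page
ceiling = max_pages(2000, 10)
# Assumption: ~500 words per page, implied by 75k words on ~150 pages
digest_words = word_budget(150, 500)
print(ceiling, digest_words)
```

The 150-page target stays comfortably below the ceiling, which leaves some slack for the cover and for chapter breaks.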

The number of words is easy to determine; what's hard is the subjective value of reading a post. The sane thing to do would be manual curation, but to reduce the marginal cost of adding a new blog, we should automate things. First, we should respect the author's opinion. We give an author-recommended post 1 point to start with; the others have 0. In pg's case, three essays are recommended on the essay index page, which is not enough discriminative power for us. To triage the other articles, it wasn't obvious what to do. I don't trust language models for this, and any criterion that's specific to the site won't save us much time over a manual selection when adding a new one. The best solution seems to be the rank of the page in Google's results for the query "site:{blog_base_url}". We query this with the Google search API. Each post gets an additional 1/(m+1) points, where m is that rank, that is, we assume quality is Zipf-distributed. The offset is chosen as the smallest integer so that all three pg-recommended essays are included.
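As a sketch, the scoring rule could look like the following. Several things here are assumptions rather than the pipeline's actual code: the name `post_score`, the convention that the rank starts at 1, the choice to give posts missing from the search results no rank points, and the reading of "offset" as the constant added to the rank in the denominator.

```python
from typing import Optional


def post_score(recommended: bool, rank: Optional[int], offset: int = 1) -> float:
    """Value of a post: 1 point if author-recommended, plus 1/(rank + offset).

    rank   -- position in the search results (assumed to start at 1), or None
              if the post did not show up at all (assumed to earn nothing)
    offset -- assumed to be the constant in the denominator; 1 gives 1/(m+1)
    """
    score = 1.0 if recommended else 0.0
    if rank is not None:
        score += 1.0 / (rank + offset)
    return score


print(post_score(True, None))   # recommended, not in the results
print(post_score(False, 1))     # top search result, not recommended
```

Because the rank term is always below 1, a recommended post always outscores a non-recommended one under these assumptions, whatever the search results say.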

After solving the Knapsack problem with the word count as the cost, we have a list of 41 essays approved for print.
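For reference, here is a minimal 0/1 knapsack solver using the textbook dynamic program. It is a sketch and not necessarily how the selection above was computed. The function name is made up, and for a 75k word budget one would probably round word counts to units of, say, 100 words to keep the table small (also an assumption).

```python
from typing import List, Sequence


def knapsack(costs: Sequence[int], values: Sequence[float], capacity: int) -> List[int]:
    """Return indices of the items maximizing total value within the capacity.

    costs must be non-negative integers (e.g. word counts in units of 100).
    """
    n = len(costs)
    best = [0.0] * (capacity + 1)  # best[w]: max value using at most w cost
    keep = [[False] * (capacity + 1) for _ in range(n)]  # item i improved best[w]?

    for i in range(n):
        c, v = costs[i], values[i]
        # Iterate capacities downwards so each item is used at most once
        for w in range(capacity, c - 1, -1):
            candidate = best[w - c] + v
            if candidate > best[w]:
                best[w] = candidate
                keep[i][w] = True

    # Walk backwards through the decisions to recover the chosen items
    chosen = []
    w = capacity
    for i in range(n - 1, -1, -1):
        if keep[i][w]:
            chosen.append(i)
            w -= costs[i]
    return sorted(chosen)


# Toy example: three posts, budget of 6 units
print(knapsack([3, 4, 2], [4.0, 5.0, 3.0], 6))  # → [1, 2]
```

In the toy example, posts 1 and 2 fill the budget exactly with a total value of 8, beating posts 0 and 2 at a value of 7.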

Since 75k words worth of essays without any clustering is still overwhelming, chapters would be nice. Unfortunately this blog doesn't have tags or native categories, which would be preferred. I was so naive as to think language models could by now come up with good clusters, but they turned out quite bad, so we write the clusters manually.

A simple cover page in Typst, a main body rendered in pandoc with more or less standard settings, and we're ready to render. If you've decided you want a copy yourself, now is the point in the story where you join the action: printing. The correct solution would be a print and bind service, or at the very least a professional grade printer. I had a simple home printer and stapler only, so let us see how far we can get with that. Separating out the print job into the cover page with high quality color, and two runs for odd and even pages in black and white, this took around 30 minutes.
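If your print dialog doesn't offer odd/even runs, the page lists are easy to generate. A small sketch; the function name is made up, and whether the even run needs to go in reverse order depends on how your printer stacks its output, so that is left as a flag.

```python
from typing import List, Tuple


def duplex_runs(n_pages: int, reverse_even: bool = False) -> Tuple[List[int], List[int]]:
    """Split pages 1..n_pages into an odd run (fronts) and an even run (backs)."""
    odd = list(range(1, n_pages + 1, 2))
    even = list(range(2, n_pages + 1, 2))
    if reverse_even:
        # Some printers stack sheets so that the backs must be fed last-first
        even.reverse()
    return odd, even


fronts, backs = duplex_runs(5)
print(fronts, backs)  # → [1, 3, 5] [2, 4]
```

With an odd page count, the last sheet has no back side, so it should be set aside before feeding the stack back in for the even run.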

After the stack of 70 sheets was done, I realized I had not thought about how to attach them to each other, and my home stapler certainly wouldn't do all 70 at once. I landed on stapled batches of 13 pages whose first and last 2 are connected with a second staple. In retrospect larger batches would probably have been fine, but it at least allowed me to get some practice in. The result is a reasonably pretty and readable booklet, ready for the nightstand.

Next up, the one and only Marginal Revolution. It immediately broke the pipeline with its high number of short posts. The best way to select these would probably be picking by hand, but the whole point was for me to read posts I hadn't gotten around to yet, so I just asked a language model for a curated list. These were fine, but I imagine a seasoned MR reader would turn up a much better selection, so I erred on the shorter side with only 68 pages. For the cover image, I wanted something visually pretty from economic theory, so I made an Edgeworth Box diagram. Other adjustments needed were image formatting, removal of fluff images, and the author byline.

Max Hodak's blog is another really great and printable blog, short enough to format in its entirety. The only adjustments needed for formatting were getting code and math right. For the cover, I tried using some of my drawings, but ended up unsatisfied with both the drawings themselves and the result of the scanning process, so I instead used a crop of the painting The Agnew Clinic by Thomas Eakins.

Most of this site, fi-le.net, turns out to be pretty hard to print. I try to get creative with the html medium, so some articles are heavy on interactive plots and other things that don't translate well to paper. I excluded many, and for the remaining figures, rendered from html to a still png with a headless browser. As the cover, the only purely decorative image used on the site had to do, a digital study I did of a painting by John Singer Sargent.
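One way to take such a still is headless Chrome's screenshot mode, sketched below. This is an assumption about tooling, not a description of the setup used here: the binary name `chromium` varies by system, the window size is arbitrary, and the helper names are made up.

```python
import subprocess
from typing import List


def screenshot_command(url: str, out_png: str, width: int = 1200,
                       height: int = 800, browser: str = "chromium") -> List[str]:
    """Build a headless-Chrome command that saves a PNG of the page at `url`."""
    return [
        browser,                           # assumed binary name, adjust as needed
        "--headless",
        "--hide-scrollbars",
        f"--window-size={width},{height}",
        f"--screenshot={out_png}",
        url,
    ]


def render_still(url: str, out_png: str, **kwargs) -> None:
    """Run the browser; raises CalledProcessError if it exits with an error."""
    subprocess.run(screenshot_command(url, out_png, **kwargs), check=True)


print(screenshot_command("file:///tmp/plot.html", "plot.png"))
```

Plots that draw themselves asynchronously may need extra waiting before the capture, which this sketch does not handle.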

As a formatting final boss of sorts, we turn to Alexey Guzey's awesome blog. Here we encounter lots of images, links, calls to action, metadata etc. that need treatment for print. The index with recommended posts helps with selection. For the cover, an assembly illustration of an IBM printer felt appropriate because it reflects the listlessly listing, dissecting nature of the site.

These five printouts should keep me occupied for a while, but of course there are many more blogs deserving of a print edition. If you have suggestions, requests for modifications, et cetera, feel free to mail me at info@fi-le.net or open a PR on the repo for this project.

I think the blog as a medium is still underappreciated, in part because blogs don't have a physical representation. We've changed that a little bit today, and indeed, picking them up from my desk, I feel the ideas presented now carry more weight.

For a previous entry in this mini-series about software that interacts with the blogging ecosystem, see this one about semi-automatically spell-correcting a few hundred websites.