
Making a Static Site Generator 30 Times Faster

The original post is at eklausmeier.goip.de/blog/2021/10-19-making-static-site-generator-30-times-faster.


Static site generators take Markdown files as input and generate HTML files. This way, when the end user visits a website, all pages are already generated, and the web server serves just plain HTML. No database is needed, and no server-side scripting language is required. Two main advantages are easy to spot:

  1. Serving fixed HTML files is much faster than generating HTML on the fly from a database with a server-side scripting language
  2. The attack surface is much smaller

This blog is generated using Saaze, which I use as a static site generator. Initially, I only modified the toHtml() function to accommodate MathJax, YouTube and Vimeo videos, Twitter, Codepen, etc. Then I found out that the conversion from Markdown to HTML accounts for a major part of Saaze's CPU consumption. Therefore I wrote a very simple PHP-FFI program to call MD4C, a C routine that does this conversion. This alone cut the CPU time in half. See Calling MD4C from PHP via FFI.

Looking closer at the source code of Saaze, I saw that many functions that are already available in PHP itself were reimplemented in separate PHP libraries, see Considerations Regarding Simplifications of Saaze. One example is the YAML parser, see yaml_parse. This added a number of dependencies to Saaze.

When I instrumented the Saaze code, I saw that some functions were called far more often than could possibly be sensible. Therefore, I tried to better understand how Saaze works and to reduce it to the work that actually needs to be done. This modified version of Saaze is called "Simplified Saaze". It follows the ideas sketched out in Considerations Regarding Simplifications of Saaze. There I mentioned that many parts of Saaze are simply not necessary and add weight without adding useful functionality.
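The kind of instrumentation described above can be sketched as follows. This is a minimal illustration in Python, not the code used in Saaze (which is PHP): the names `instrumented`, `to_object`, `build`, and `report`, as well as the `name=seconds/calls` output format, are assumptions made for this sketch. The idea is only to count and time every call, so that redundant calls show up in the totals.

```python
import time
from collections import defaultdict
from functools import wraps

# Accumulated statistics per function name: [number of calls, total seconds].
STATS = defaultdict(lambda: [0, 0.0])


def instrumented(func):
    """Wrap func so that every call is counted and timed."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            entry = STATS[func.__name__]
            entry[0] += 1
            entry[1] += time.perf_counter() - start
    return wrapper


@instrumented
def to_object(text):
    # Hypothetical stand-in for an expensive per-entry step.
    return {"length": len(text)}


def build(entries):
    # Deliberately wasteful: every entry is converted twice, the kind of
    # redundancy that call counts make visible.
    for text in entries:
        to_object(text)
        to_object(text)


def report():
    # Output format (name=seconds/calls) is an assumption of this sketch.
    return ", ".join(
        f"{name}={secs:.4f}/{calls}"
        for name, (calls, secs) in sorted(STATS.items())
    )


build(["first post", "second post", "third post"])
print(report())
```

With three entries the report shows six calls to `to_object`, twice as many as there are entries, which is the sort of discrepancy that prompts a closer look at the call structure.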

When the old Saaze processes this blog with roughly 330 blog posts, it needs about 10s.

$ time php saaze build   
Building static site in /home/klm/tmp/thdsaaze/build...
Finished creating 2 collections and 329 entries (8.69 secs / 11.23MB)
        real 8.74s
        user 8.45s
        sys 0
        swapped 0
        total space 0

The "Simplified Saaze" needs just about 1/3s. It is important to note that the "Simplified Saaze" is doing a lot more work, i.e., it also handles math, YouTube, Twitter, etc.

$ time php saaze build 
Building static site in /home/klm/tmp/sndsaaze/build...
        execute(): filePath=/home/klm/tmp/sndsaaze/content/blog.yml, nentries=329, totalPages=11, entries_per_page=30
        execute(): filePath=/home/klm/tmp/sndsaaze/content/music.yml, nentries=11, totalPages=1, entries_per_page=30
        execute(): filePath=/home/klm/tmp/sndsaaze/content/pages.yml, nentries=2, totalPages=1, entries_per_page=30
Finished creating 3 collections and 342 entries (0.28 secs / 12.19MB)
#collections=3, YamlParser=0.0130/345-3, md2html=0.0130, MathParser=0.0061/342, toObject=0.0715/716, renderEntry=342, content=684/342, excerpt=342/0
        real 0.34s
        user 0.26s
        sys 0
        swapped 0
        total space 0

The ratio is therefore 8.74/0.34=25.7, or taking the user times, 8.45/0.26=32.5.
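The arithmetic can be reproduced with a few lines of Python. The timings are copied from the two `time php saaze build` runs above; the helper name `speedup` is just a label chosen for this sketch.

```python
# Timings in seconds, copied from the two build runs shown above.
old = {"real": 8.74, "user": 8.45}   # old Saaze
new = {"real": 0.34, "user": 0.26}   # Simplified Saaze


def speedup(before, after):
    """Return how many times faster the 'after' run is."""
    return before / after


for kind in ("real", "user"):
    print(f"{kind}: {speedup(old[kind], new[kind]):.1f}x faster")
# prints:
# real: 25.7x faster
# user: 32.5x faster
```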

The above ratio favors the old Saaze code quite a bit: if I included one particular post, Profiling PHP Programs, then the old Saaze would need more than 100s:

$ time php saaze build
Building static site in /home/klm/tmp/thdsaaze/build...
Finished creating 2 collections and 330 entries (104.99 secs / 15.37MB)
        real 105.06s
        user 102.93s
        sys 0
        swapped 0
        total space 0

I left this special post out of the comparison so as not to overstate the effect of MD4C.

So one can clearly see that optimizing at different levels can lead to a substantial reduction in CPU time. Evidently, Parsedown Extra is inefficient.