Calling MD4C from PHP via FFI
Original post is here eklausmeier.goip.de/blog/2021/07-11-calling-md4c-from-php-via-ffi.
1. Problem statement. For any static site generator, converting Markdown to HTML is an important part of the job. In my case I use Saaze, and I measured that roughly 60% of the overall runtime is spent converting Markdown to HTML. I have written on Saaze here and here. Converting my roughly 320 posts took two seconds. When my machine is fully loaded with other computations, for example astrophysical computations, converting to static takes four seconds. More than 60% of that total runtime was spent in the toHtml() routine alone.
PHP 8 offers FFI, the Foreign Function Interface (available since PHP 7.4). It was inspired by LuaJIT FFI. PHP FFI, authored by Dmitry Stogov, is a very easy-to-use interface to C routines. Although writing a PHP extension is quite easy, calling C routines via FFI is dead simple. Hence, it was natural to replace the toHtml() routine in PHP with MD4C.
FFI has to be enabled in php.ini, see for example PHP extension seg-faulting.
2. C library. MD4C is a C library and auxiliary stand-alone executable to convert Markdown to HTML. It was written by Martin Mitas. It is installed on many Linux distributions by default, as it is used in Qt.
MD4C is very fast. It is faster than cmark. In many cases it is 2-5 times faster than cmark. See Why is MD4C so fast?.
Test name | Simple input | MD4C (seconds) | Cmark (seconds) |
---|---|---|---|
cmark-benchinput.md | (benchmark from CMark) | 0.3650 | 0.7060 |
long-block-multiline.md | "foo\n " * 1000000 | 0.0400 | 0.2300 |
long-block-oneline.md | "foo " * 10 * 1000000 | 0.0700 | 0.1000 |
many-atx-headers.md | "###### foo\n " * 1000000 | 0.0900 | 0.4670 |
many-blanks.md | "\n " * 10 * 1000000 | 0.0700 | 0.3110 |
many-emphasis.md | "*foo* " * 1000000 | 0.1100 | 0.8460 |
many-fenced-code-blocks.md | "~~~\nfoo\n~~~\n\n " * 1000000 | 0.1600 | 0.4010 |
many-links.md | "[a](/url) " * 1000000 | 0.2100 | 0.5110 |
many-paragraphs.md | "foo\n\n " * 1000000 | 0.0900 | 0.4860 |
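To make the table easier to read, the per-test ratio of cmark runtime to MD4C runtime can be computed directly. The following is only a small sketch in C; the names `bench_row`, `cmark_over_md4c` and `print_ratios` are made up for this illustration, and the numbers are copied from the table above.

```c
#include <stdio.h>

/* One row of the benchmark table: test name and runtimes in seconds.
   The struct and function names are illustrative, not part of MD4C. */
struct bench_row {
    const char *name;
    double md4c_sec;
    double cmark_sec;
};

/* Values copied from the table above. */
static const struct bench_row rows[] = {
    { "cmark-benchinput.md",        0.3650, 0.7060 },
    { "long-block-multiline.md",    0.0400, 0.2300 },
    { "long-block-oneline.md",      0.0700, 0.1000 },
    { "many-atx-headers.md",        0.0900, 0.4670 },
    { "many-blanks.md",             0.0700, 0.3110 },
    { "many-emphasis.md",           0.1100, 0.8460 },
    { "many-fenced-code-blocks.md", 0.1600, 0.4010 },
    { "many-links.md",              0.2100, 0.5110 },
    { "many-paragraphs.md",         0.0900, 0.4860 },
};
static const int nrows = sizeof(rows) / sizeof(rows[0]);

/* How many times longer cmark takes than MD4C for one row. */
static double cmark_over_md4c(const struct bench_row *r) {
    return r->cmark_sec / r->md4c_sec;
}

/* Print one line per test with the runtime ratio. */
static void print_ratios(void) {
    for (int i = 0; i < nrows; i++)
        printf("%-28s %5.2fx\n", rows[i].name, cmark_over_md4c(&rows[i]));
}
```

Calling `print_ratios()` from a `main()` lists the ratios. For these numbers they range from about 1.4 (long-block-oneline.md) to about 7.7 (many-emphasis.md).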
Here is another speed comparison between cmark, md4c and commonmark.js:
Implementation | Time (sec) |
---|---|
commonmark.js | 0.59 |
cmark | 0.12 |
md4c | 0.04 |
3. PHP and C code. The code to be called instead of toHtml() is therefore:
$ffi = FFI::cdef("char *md4c_toHtml(const char*);","/srv/http/php_md4c_toHtml.so");
$html = FFI::string( $ffi->md4c_toHtml($markdown) );
To test the call to md4c_toHtml(), use the PHP program below with the name of a Markdown file as first argument:
<?php
$ffi = FFI::cdef("char *md4c_toHtml(const char*);","/srv/http/php_md4c_toHtml.so");
printf("argv1 = %s\n", $argv[1]);
$markdown = file_get_contents($argv[1]);
$html = FFI::string( $ffi->md4c_toHtml($markdown) );
printf("%s", $html);
The routine md4c_toHtml() is:
/* Provide md4c to PHP via FFI
   Copied many portions from Martin Mitas:
       https://github.com/mity/md4c/blob/master/md2html/md2html.c
   Compile like this:
       cc -fPIC -Wall -O2 -shared php_md4c_toHtml.c -o php_md4c_toHtml.so -lmd4c-html
   This routine is not thread-safe. For threading we either need a thread-id passed
   or using a mutex to guard the static/global mbuf.
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <md4c-html.h>

struct membuffer {
    char* data;
    size_t asize;
    size_t size;
};

static void membuf_init(struct membuffer* buf, MD_SIZE new_asize) {
    buf->size = 0;
    buf->asize = new_asize;
    if ((buf->data = malloc(buf->asize)) == NULL) {
        fprintf(stderr, "membuf_init: malloc() failed.\n");
        exit(1);
    }
}

static void membuf_grow(struct membuffer* buf, size_t new_asize) {
    buf->data = realloc(buf->data, new_asize);
    if(buf->data == NULL) {
        fprintf(stderr, "membuf_grow: realloc() failed.\n");
        exit(1);
    }
    buf->asize = new_asize;
}

static void membuf_append(struct membuffer* buf, const char* data, MD_SIZE size) {
    if(buf->asize < buf->size + size)
        membuf_grow(buf, buf->size + buf->size / 2 + size);
    memcpy(buf->data + buf->size, data, size);
    buf->size += size;
}

static void process_output(const MD_CHAR* text, MD_SIZE size, void* userdata) {
    membuf_append((struct membuffer*) userdata, text, size);
}

static struct membuffer mbuf = { NULL, 0, 0 };

char *md4c_toHtml(const char *markdown) {    // return HTML string
    int ret;
    if (mbuf.asize == 0) membuf_init(&mbuf,16777216);
    mbuf.size = 0;    // prepare for next call
    ret = md_html(markdown,strlen(markdown),process_output,&mbuf,MD_DIALECT_GITHUB,0);
    membuf_append(&mbuf,"\0",1);    // make it a null-terminated C string, so PHP can deduce length
    if (ret < 0) return "<br>- - - Error in Markdown - - -<br>\n";
    return mbuf.data;
}
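The routine above keeps one static buffer and calls exit() when an allocation fails. As a hedged sketch of an alternative, not what the code above does, the buffer could be owned by the caller and report failure through a return value. With no shared static state, each thread could use its own buffer, which sidesteps the thread-safety issue mentioned in the header comment without a mutex. All names (`outbuf`, `outbuf_append`, `outbuf_reset`, `outbuf_free`) are hypothetical; the growth rule mirrors membuf_append().

```c
#include <stdlib.h>
#include <string.h>

/* Growable output buffer owned by the caller (hypothetical names). */
struct outbuf {
    char  *data;   /* allocated memory, or NULL before first use */
    size_t asize;  /* allocated size in bytes */
    size_t size;   /* bytes currently used */
};

/* Append n bytes. Grows to size + size/2 + n when needed, the same rule
   as membuf_append() above. Returns 0 on success, -1 if realloc() fails;
   on failure the buffer is left unchanged. */
static int outbuf_append(struct outbuf *buf, const char *src, size_t n) {
    if (n == 0) return 0;
    if (buf->asize < buf->size + n) {
        size_t new_asize = buf->size + buf->size / 2 + n;
        char *p = realloc(buf->data, new_asize);  /* realloc(NULL, n) acts like malloc(n) */
        if (p == NULL) return -1;
        buf->data = p;
        buf->asize = new_asize;
    }
    memcpy(buf->data + buf->size, src, n);
    buf->size += n;
    return 0;
}

/* Forget the content but keep the allocation for the next document. */
static void outbuf_reset(struct outbuf *buf) {
    buf->size = 0;
}

/* Release the memory once the buffer is no longer needed. */
static void outbuf_free(struct outbuf *buf) {
    free(buf->data);
    buf->data = NULL;
    buf->asize = 0;
    buf->size = 0;
}
```

A buffer starts out as `struct outbuf b = { NULL, 0, 0 };` and could be handed to md_html() as the userdata pointer, just as &mbuf is above. The static buffer in the original has one advantage: PHP never has to free anything, because FFI::string() copies the result before the next call overwrites it.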
3. Application range. Any PHP-based static site generator would therefore profit from simply using MD4C, as would any PHP-based CMS employing Markdown. For a list of generators see Jamstack. Examples include:
- Jigsaw, based on Blade templates like Saaze
- Statamic, commercial license
- Stati by Jonathan Foucher, Jekyll compatible
- Saaze
- Pico CMS, flat file CMS using Twig templates, not a static site generator
- Grav, a flat-file CMS
- Sculpin, static site generator using Twig templates
4. Benchmarks. Benchmarks were run on a fully loaded machine:
1[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Tasks: 116, 352 thr; 8 running
2[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Load average: 7.54 7.64 7.59
3[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%] Uptime: 14 days, 07:28:59
4[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
5[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
6[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
7[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
8[|||||||||||||||||||||||||||||||||||||||||||||||||||||||||100.0%]
Mem[|||||||||||||||||||||||||||||||||||||||||||||||||| 35.6G/60.8G]
Swp[ 0K/0K]
PID USER PRI NI VIRT RES SHR S CPU%▽MEM% TIME+ Command
449817 edh 20 0 45.7G 34.8G 112M S 793. 55.9 936h /usr/bin/python /usr/bin/ipython -i script/playground.py
449911 edh 20 0 45.7G 34.8G 112M R 99.8 55.9 101h /usr/bin/python /usr/bin/ipython -i script/playground.py
449913 edh 20 0 45.7G 34.8G 112M R 99.8 55.9 101h /usr/bin/python /usr/bin/ipython -i script/playground.py
449918 edh 20 0 45.7G 34.8G 112M R 99.8 55.9 120h /usr/bin/python /usr/bin/ipython -i script/playground.py
449909 edh 20 0 45.7G 34.8G 112M R 99.1 55.9 102h /usr/bin/python /usr/bin/ipython -i script/playground.py
449914 edh 20 0 45.7G 34.8G 112M R 99.1 55.9 101h /usr/bin/python /usr/bin/ipython -i script/playground.py
449915 edh 20 0 45.7G 34.8G 112M R 99.1 55.9 101h /usr/bin/python /usr/bin/ipython -i script/playground.py
449912 edh 20 0 45.7G 34.8G 112M R 98.4 55.9 102h /usr/bin/python /usr/bin/ipython -i script/playground.py
449910 edh 20 0 45.7G 34.8G 112M R 97.8 55.9 101h /usr/bin/python /usr/bin/ipython -i script/playground.py
565502 klm 20 0 41.6G 313M 151M S 2.0 0.5 0:58.04 /usr/lib/brave-bin/brave --type=renderer --field-trial-handle=5497587067748927688,
563438 klm 20 0 1379M 107M 71204 S 0.7 0.2 0:07.56 /usr/lib/Xorg vt7 -displayfd 3 -auth /run/user/1000/gdm/Xauthority -nolisten tcp -
563557 klm 20 0 1138M 385M 182M S 0.7 0.6 0:06.74 /usr/lib/brave-bin/brave
564279 klm 20 0 848M 69248 55692 S 0.7 0.1 0:02.30 /usr/lib/brave-bin/brave --type=utility --utility-sub-type=audio.mojom.AudioServic
564290 klm 9 -11 671M 13736 9412 S 0.7 0.0 0:05.52 /usr/bin/pulseaudio --daemonize=no --log-target=journal
565585 klm 20 0 41.6G 313M 151M S 0.7 0.5 0:05.11 /usr/lib/brave-bin/brave --type=renderer --field-trial-handle=5497587067748927688,
566103 klm 20 0 9772 5356 3536 R 0.7 0.0 0:00.73 htop
1 root 20 0 169M 7760 4664 S 0.0 0.0 1:48.04 /sbin/init
In my case, using Saaze on a heavily loaded machine, runtimes previously were:
$ time php saaze build
Building static site in /home/klm/tmp/sndsaaze/build...
execute(): filePath()=/home/klm/tmp/sndsaaze/content/blog.yml, entries=1, totalPages=11, entries_per_page=30
execute(): filePath()=/home/klm/tmp/sndsaaze/content/music.yml, entries=1, totalPages=1, entries_per_page=30
execute(): filePath()=/home/klm/tmp/sndsaaze/content/pages.yml, entries=1, totalPages=1, entries_per_page=30
Finished creating 3 collections and 315 entries (3.46 secs / 10.79MB), md2html=2.1529788970947, MathParser=0.13162446022034
real 3.58s
user 2.92s
sys 0
swapped 0
total space 0
The time for converting Markdown to HTML with the PHP routine (md2html) is roughly two seconds. The MathParser for handling Twitter, YouTube, etc. needs an extra 0.1 seconds.
With MD4C called via FFI, this CPU time went down:
$ time php saaze build
Building static site in /home/klm/tmp/sndsaaze/build...
execute(): filePath()=/home/klm/tmp/sndsaaze/content/blog.yml, entries=1, totalPages=11, entries_per_page=30
execute(): filePath()=/home/klm/tmp/sndsaaze/content/music.yml, entries=1, totalPages=1, entries_per_page=30
execute(): filePath()=/home/klm/tmp/sndsaaze/content/pages.yml, entries=1, totalPages=1, entries_per_page=30
Finished creating 3 collections and 315 entries (1.99 secs / 10.29MB), md2html=0.27019762992859, MathParser=0.13629722595215
real 2.12s
user 1.27s
sys 0
swapped 0
total space 0
The time for MD4C is roughly 0.27 seconds for 315 entries. That is almost eight times faster than before.
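The arithmetic behind these figures can be checked with a few lines of C. This is just a sketch with made-up helper names (`runtime_ratio`, `share`, `print_summary`); the timings are copied from the two build logs above.

```c
#include <stdio.h>

/* Timings in seconds, copied from the two "php saaze build" runs above. */
static const double total_before   = 3.46;
static const double md2html_before = 2.1529788970947;
static const double total_after    = 1.99;
static const double md2html_after  = 0.27019762992859;

/* How many times faster the "after" runtime is than the "before" runtime. */
static double runtime_ratio(double before, double after) {
    return before / after;
}

/* Fraction of the total build time spent in one part. */
static double share(double part, double total) {
    return part / total;
}

/* Print conversion speedup, overall speedup, and md2html share of the build. */
static void print_summary(void) {
    printf("md2html speedup: %.2fx\n", runtime_ratio(md2html_before, md2html_after));
    printf("build speedup:   %.2fx\n", runtime_ratio(total_before, total_after));
    printf("md2html share before: %.1f%%\n", 100.0 * share(md2html_before, total_before));
    printf("md2html share after:  %.1f%%\n", 100.0 * share(md2html_after, total_after));
}
```

With these numbers the Markdown conversion itself is about 7.97 times faster, while the whole build is about 1.74 times faster. Conversion accounted for about 62% of the build before and about 14% afterwards, so the rest of the build now dominates the runtime.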