Daniel Lemire's blog


Why libxml2 takes forever to transform XHTML files

For my Online XML Course, I process lots and lots of XHTML content using XSLT. Up until now, I avoided the libxml2 XSLT processor (xsltproc) because it was unacceptably slow.

Today, I found out what was happening: it loads all of the XHTML DTD files from the W3C each and every time it processes an XHTML file. To get around the problem, use the --novalid flag when invoking the xsltproc command line. Because the DTD is then no longer loaded, the named entities it defines are unknown to the parser, and you might get warnings about problematic entities. I suggest you declare the entities you need in a DOCTYPE declaration like the following:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"[
<!ENTITY laquo "&#171;">
<!ENTITY raquo "&#187;">
<!ENTITY oelig "&#339;">
<!ENTITY nbsp "&#160;">
]>
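
If you drive xsltproc from a script rather than typing the command by hand, the flag can be baked into a small wrapper. The following is only a minimal sketch in Python, assuming xsltproc is installed and on your PATH; the function names are my own and not part of libxml2.

```python
import shutil
import subprocess


def xsltproc_command(stylesheet, document):
    """Build an xsltproc command line that skips loading the DTD."""
    # --novalid tells xsltproc not to load the document's DTD.
    return ["xsltproc", "--novalid", stylesheet, document]


def transform(stylesheet, document):
    """Run xsltproc and return the transformed document as text."""
    if shutil.which("xsltproc") is None:
        raise RuntimeError("xsltproc was not found on PATH")
    result = subprocess.run(
        xsltproc_command(stylesheet, document),
        check=True,           # raise CalledProcessError on a non-zero exit
        capture_output=True,  # collect stdout (the result) and stderr (warnings)
        text=True,
    )
    return result.stdout
```

With hypothetical file names, calling transform("course.xsl", "lesson.xhtml") would run the equivalent of "xsltproc --novalid course.xsl lesson.xhtml" and return the output as a string.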