I've followed this article to use FlyingSaucer to convert XHTML to PDF and it's brilliant, but it has one major drawback: it's ridiculously slow!
I'm finding that it takes between 1 and 2 minutes to render a PDF from an XHTML file, regardless of how simple the page is.
Basic code:
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import org.xhtmlrenderer.pdf.ITextRenderer;
import com.lowagie.text.DocumentException;
public class FirstDoc {
    public static void main(String[] args) throws IOException, DocumentException {
        String inputFile = "firstdoc.xhtml";
        String url = new File(inputFile).toURI().toURL().toString();
        String outputFile = "firstdoc.pdf";

        // try-with-resources closes the stream even if rendering fails
        try (OutputStream os = new FileOutputStream(outputFile)) {
            ITextRenderer renderer = new ITextRenderer();
            renderer.setDocument(url);
            renderer.layout();
            renderer.createPDF(os);
        }
    }
}
Sample XHTML:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
    <head>
        <title>My First Document</title>
        <style type="text/css"> b { color: green; } </style>
    </head>
    <body>
        <p>
            <b>Greetings Earthlings!</b>
            We've come for your Java.
        </p>
    </body>
</html>
Does anyone know how to improve the performance of FlyingSaucer?
Failing that, is anyone able to recommend an alternative Java library which is effective at rendering a PDF from a URL to an (X)HTML document with external CSS and images generated from URLs?
I was facing the same problem as Edd.
Sadly, the approach from Marek Piechut's answer to "Java DocumentBuilder: xml parsing is very slow?" didn't work completely for me: my HTML entities got lost on the way.
// Disables validation and DTD loading altogether; the DTD's entity declarations are then never read
DocumentBuilderFactory fac = DocumentBuilderFactory.newInstance();
fac.setNamespaceAware(false);
fac.setValidating(false);
fac.setFeature("http://xml.org/sax/features/namespaces", false);
fac.setFeature("http://xml.org/sax/features/validation", false);
fac.setFeature("http://apache.org/xml/features/nonvalidating/load-dtd-grammar", false);
fac.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
DocumentBuilder builder = fac.newDocumentBuilder();
What finally did the trick were these lines:
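// FSEntityResolver ships with Flying Saucer (org.xhtmlrenderer.resource.FSEntityResolver)
DocumentBuilderFactory fac = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = fac.newDocumentBuilder();
builder.setEntityResolver(FSEntityResolver.instance());
// Parse the XHTML with this builder and pass the resulting Document to the renderer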
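Using Flying Saucer's built-in EntityResolver (FSEntityResolver) to resolve the DTD made it tremendously faster: the DTD and its entity definitions are served from local copies instead of being fetched over the network, so the entities stay intact.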
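To illustrate the idea behind a local entity resolver, here is a minimal sketch that uses only the JDK, so it runs without Flying Saucer on the classpath. The class name LocalDtdDemo, the one-line STUB_DTD and the SAMPLE document are made up for this example. As far as I understand it, the real FSEntityResolver maps the public IDs to full DTD files bundled with the library, while this stub declares a single entity only to show that the parser never has to leave the machine and the entity still survives.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

class LocalDtdDemo {
    // Hypothetical stand-in for a locally stored XHTML DTD: declares just one entity.
    static final String STUB_DTD = "<!ENTITY nbsp \"&#160;\">";

    static final String XHTML_STRICT = "-//W3C//DTD XHTML 1.0 Strict//EN";

    // Small document with the same DOCTYPE as the question, plus an entity reference.
    static final String SAMPLE =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\"\n"
        + "    \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\">\n"
        + "<html xmlns=\"http://www.w3.org/1999/xhtml\"><body><p>a&nbsp;b</p></body></html>";

    // Public IDs the parser asked for, recorded so the lookups are visible.
    static final List<String> REQUESTED = new ArrayList<>();

    // Answers DTD lookups from memory instead of letting the parser open a URL.
    static final EntityResolver LOCAL = (publicId, systemId) -> {
        REQUESTED.add(publicId);
        if (XHTML_STRICT.equals(publicId)) {
            return new InputSource(new StringReader(STUB_DTD));
        }
        // Anything else gets an empty source, so nothing is ever downloaded.
        return new InputSource(new StringReader(""));
    };

    // Parses the given XHTML with the local resolver and returns the text of the first <p>.
    static String paragraphText(String xhtml) {
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setEntityResolver(LOCAL);
            Document doc = builder.parse(new InputSource(new StringReader(xhtml)));
            return doc.getElementsByTagName("p").item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        String text = paragraphText(SAMPLE);
        System.out.println("DTDs requested: " + REQUESTED);
        System.out.println("nbsp preserved: " + (text.indexOf('\u00a0') >= 0));
    }
}
```

With Flying Saucer itself you would not write such a resolver by hand; FSEntityResolver.instance() plays that role, and the parsed Document can then be passed to renderer.setDocument(doc, url) instead of letting the renderer load the URL on its own.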