I'am looking for some "stable" method to convert DOCX file from MS WORD into PDF. Since now I have used OpenOffice installed as listener but it often hangs. The problem is that we have situations when many users want to convert SXW,DOCX files into PDF at the same time. Is there some other possibility? I tryed with examples from this site: https://angelozerr.wordpress.com/2012/12/06/how-to-convert-docxodt-to-pdfhtml-with-java/ but the output result is not good (converted documents have errors and layout is quite modified).
here is "source" docx document:
here is document converted with docx4j with some exception text inside document. Also the text in upper right corner is missing.
this one is PDF created with OpenOffice as converter from docx to pdf. Some text is missing "upper right corner"
Is there some other option to convert docx into pdf with Java?
There are lot of methods to do conversion One of the used method is using POI and DOCX4j
InputStream is = new FileInputStream(new File("your Docx PAth"));
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage
.load(is);
List sections = wordMLPackage.getDocumentModel().getSections();
for (int i = 0; i < sections.size(); i++) {
wordMLPackage.getDocumentModel().getSections().get(i)
.getPageDimensions();
}
Mapper fontMapper = new IdentityPlusMapper();
PhysicalFont font = PhysicalFonts.getPhysicalFonts().get(
"Comic Sans MS");//set your desired font
fontMapper.getFontMappings().put("Algerian", font);
wordMLPackage.setFontMapper(fontMapper);
PdfSettings pdfSettings = new PdfSettings();
org.docx4j.convert.out.pdf.PdfConversion conversion = new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(
wordMLPackage);
//To turn off logger
List<Logger> loggers = Collections.<Logger> list(LogManager
.getCurrentLoggers());
loggers.add(LogManager.getRootLogger());
for (Logger logger : loggers) {
logger.setLevel(Level.OFF);
}
OutputStream out = new FileOutputStream(new File("Your OutPut PDF path"));
conversion.output(out, pdfSettings);
System.out.println("DONE!!");
This works perfect and even tried on multiple DOCX files.