Converting docx/odt to PDF using JavaScript

ncohen · May 11, 2014 · Viewed 19.6k times

I have a node web app that needs to convert a docx file into PDF (using client-side resources only and no plugins). I've found a possible solution: convert the docx into HTML using docxjs, then convert the HTML to PDF using jspdf (docx->HTML->PDF). This approach could work, but I ran into several issues, especially with rendering. I know that docxjs doesn't reproduce the docx layout faithfully in HTML, so that is a problem...

So my question is: do you know of any free module/solution that could do the job directly, without going through HTML (I'm open to odt as a source as well)? If not, what would you advise me to do?

Thanks

Answer

zarkone · May 14, 2014

As you already know, there are no ready-to-use open-source libraries for this. You just can't get good results with the available options. My suggestions are:

  1. Use a third-party API, such as https://market.mashape.com/convertapi/word2pdf-1#!documentation
  2. Create your own service for this purpose. If you are able to, I suggest setting up a small server on node.js (I bet you know how to do this). You can use LibreOffice as the converter, since its rendering quality is good. For example:

    libreoffice --headless --invisible --convert-to pdf {$file_name} --outdir /www-disk/

    Don't forget that this usually takes a lot of time, so do not block the request-response flow: use a separate process for each conversion.

    One last thing: LibreOffice is not very lightweight, but it gives good quality. The unoconv tool is also worth a look.
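To make option 2 concrete, here is a minimal sketch of running the LibreOffice command from node.js in a child process, so the conversion does not block the request-response flow. The function names (`buildConvertArgs`, `expectedPdfPath`, `convertToPdf`), the `soffice` binary name, and the 60-second timeout are assumptions for illustration; on some systems the executable is called `libreoffice` instead.

```javascript
const { execFile } = require('child_process');
const path = require('path');

// Build the argument list for a headless LibreOffice conversion to PDF.
// Passing arguments as an array (instead of one shell string) avoids
// quoting problems with file names that contain spaces.
function buildConvertArgs(inputPath, outDir) {
  return ['--headless', '--convert-to', 'pdf', '--outdir', outDir, inputPath];
}

// LibreOffice names the output after the input file, with the extension
// replaced by the target format.
function expectedPdfPath(inputPath, outDir) {
  const base = path.basename(inputPath, path.extname(inputPath));
  return path.join(outDir, base + '.pdf');
}

// Run the conversion in a separate process; the event loop stays free
// while LibreOffice works. `binary` and the timeout are assumptions.
function convertToPdf(inputPath, outDir, binary = 'soffice') {
  return new Promise((resolve, reject) => {
    const args = buildConvertArgs(inputPath, outDir);
    execFile(binary, args, { timeout: 60000 }, (err) => {
      if (err) return reject(err);
      resolve(expectedPdfPath(inputPath, outDir));
    });
  });
}
```

A request handler could then call `convertToPdf(uploadedPath, '/www-disk/')` and send the file once the promise resolves, or reply with an error if it rejects. If many conversions can arrive at once, it is probably worth capping how many child processes run in parallel, since each LibreOffice instance is fairly heavy.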

As of January 2019, there is also docx-wasm, which runs in node and performs the conversion locally, on the machine where node is installed. It is proprietary but freemium.