Convert Word doc to HTML programmatically in Java

kaychaks picture kaychaks · Oct 22, 2008 · Viewed 70.5k times · Source

I need to convert a Word document into HTML file(s) in Java. The function will take input an word document and the output will be html file(s) based on the number of pages the word document has i.e. if the word document has 3 pages then there will be 3 html files generated having the required page break.

I searched for open source/non-commercial APIs which can convert doc to html but for no result. Anybody who have done this type of job before please help.

Thanks

Answer

Fisher picture Fisher · Jun 23, 2011

I recommend the JODConverter, It leverages OpenOffice.org, which provides arguably the best import/export filters for OpenDocument and Microsoft Office formats available today.

JODConverter has a lot of documents, scripts, and tutorials to help you out.