Convert doc to txt via commandline

user698601 · Jun 28, 2011 · Viewed 15.4k times

We're looking for a program that can convert a .doc or .docx document to a .txt file. We're working on Linux and want to start a website that converts user-uploaded doc files. We don't want to use OpenOffice/LibreOffice because we've had bad experiences with it. Pandoc can't handle .doc files :/

Does anyone have an idea?

Answer

harlandski · Nov 12, 2016

You will have to use two different command-line tools, depending on whether you are working with the .doc or the .docx format.

For .doc use catdoc:

catdoc foo.doc > foo.txt

For .docx use docx2txt:

docx2txt foo.docx

The latter will produce a file called foo.txt in the same directory as the original.
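
If the site has to accept both formats, the two commands above could be wrapped in a small dispatcher that picks the tool from the file extension. This is only a rough sketch: the function names pick_converter and to_txt are made up for illustration, and it assumes that catdoc and docx2txt are on the PATH and that the extension matches the real file format.

```shell
#!/bin/sh
# Hypothetical helpers, not part of catdoc or docx2txt.

# Print the name of the converter for a file, based on its
# extension (case-insensitive). Returns 1 if unsupported.
pick_converter() {
    case "$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')" in
        *.docx) echo docx2txt ;;
        *.doc)  echo catdoc ;;
        *)      return 1 ;;
    esac
}

# Convert "$1" to a .txt file next to the original.
to_txt() {
    src=$1
    case "$(pick_converter "$src")" in
        # catdoc writes to stdout, so redirect it ourselves
        catdoc)   catdoc "$src" > "${src%.*}.txt" ;;
        # docx2txt writes foo.txt next to foo.docx on its own
        docx2txt) docx2txt "$src" ;;
        *)        echo "unsupported file type: $src" >&2; return 1 ;;
    esac
}
```

Calling to_txt upload.doc or to_txt upload.docx should then leave upload.txt in the same directory. Since the files come from users, it is probably worth checking the actual file type as well instead of trusting the extension alone.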

I'm not sure which Linux distribution you are using, but both catdoc and docx2txt are available from the Ubuntu repositories, for example:

apt-get install docx2txt

Or with Homebrew on Mac:

brew install docx2txt