We're looking for a program that can convert a .doc or .docx document to a .txt file. We're working on Linux and want to start a website that converts user-uploaded .doc files. We don't want to use OpenOffice/LibreOffice because we've had bad experiences with it. Pandoc can't handle .doc files :/
Does anyone have an idea?
You will have to use two different command-line tools, depending on whether you are working with the .doc or the .docx format.
For .doc use catdoc:
catdoc foo.doc > foo.txt
For .docx use docx2txt:
docx2txt foo.docx
The latter will produce a file called foo.txt in the same directory as the original.
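Since the website will receive both formats, the two commands above could be wrapped in a small script that picks the tool by file extension. This is only a rough sketch: `convert_to_txt` and `convert.sh` are names I made up, and it assumes catdoc and docx2txt are installed and on the PATH.

```shell
#!/bin/sh
# Convert Word documents to plain text, choosing the tool by file extension.
# convert_to_txt is a hypothetical helper name, not part of either tool.
convert_to_txt() {
    input=$1
    case "$input" in
        *.docx)
            # docx2txt writes foo.txt next to foo.docx on its own
            docx2txt "$input"
            ;;
        *.doc)
            # catdoc prints to stdout, so redirect into foo.txt
            catdoc "$input" > "${input%.doc}.txt"
            ;;
        *)
            echo "unsupported file type: $input" >&2
            return 1
            ;;
    esac
}

# Usage: sh convert.sh foo.doc bar.docx
for f in "$@"; do
    convert_to_txt "$f"
done
```

Note that the extension match is case-sensitive, so a file named FOO.DOC would be rejected as written; you would need extra patterns if your users upload such files.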
I'm not sure which Linux distribution you are using, but both catdoc and docx2txt are available from the Ubuntu repositories, for example:
apt-get install catdoc docx2txt
Or with Homebrew on Mac:
brew install docx2txt