I need a way to convert .doc or .docx files to .txt without installing anything. Obviously, I also don't want to have to open Word manually to do this; it just needs to run automatically.
I was thinking that either Perl or VBA could do the trick, but I can't find anything online for either.
Any suggestions?
A simple Perl-only solution for docx:
Use Archive::Zip to get the word/document.xml file from your docx file. (A docx is just a zipped archive.)
Use XML::LibXML to parse it.
Then use XML::LibXSLT to transform it into text or HTML format. Search the web to find a nice docx2txt.xsl file :)
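Here is a rough sketch of those three steps, assuming Archive::Zip, XML::LibXML and XML::LibXSLT are already available in your Perl. The inline stylesheet is only a tiny stand-in I wrote for illustration, not a real docx2txt.xsl: it emits the text runs of each paragraph and ignores things like footnotes, headers and nested text boxes. The sub names xml_to_text and docx_to_text are made up for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Archive::Zip qw(:ERROR_CODES);
use XML::LibXML;
use XML::LibXSLT;

# Minimal illustrative stylesheet (an assumption, not a published docx2txt.xsl):
# one output line per w:p paragraph, built from its w:t text runs.
my $XSL = <<'XSL';
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <xsl:output method="text" encoding="UTF-8"/>
  <xsl:template match="/">
    <xsl:apply-templates select="//w:body//w:p"/>
  </xsl:template>
  <xsl:template match="w:p">
    <xsl:apply-templates select=".//w:t | .//w:tab | .//w:br"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
  <xsl:template match="w:t"><xsl:value-of select="."/></xsl:template>
  <xsl:template match="w:tab"><xsl:text>&#9;</xsl:text></xsl:template>
  <xsl:template match="w:br"><xsl:text>&#10;</xsl:text></xsl:template>
</xsl:stylesheet>
XSL

# Parse the WordprocessingML string and run it through the stylesheet.
sub xml_to_text {
    my ($xml) = @_;
    my $source     = XML::LibXML->load_xml(string => $xml);
    my $style_doc  = XML::LibXML->load_xml(string => $XSL);
    my $stylesheet = XML::LibXSLT->new->parse_stylesheet($style_doc);
    my $result     = $stylesheet->transform($source);
    return $stylesheet->output_as_chars($result);
}

# Open the docx as a zip archive and pull out word/document.xml.
sub docx_to_text {
    my ($path) = @_;
    my $zip = Archive::Zip->new;
    $zip->read($path) == AZ_OK
        or die "Cannot read '$path' as a zip archive\n";
    my $xml = $zip->contents('word/document.xml');
    defined $xml
        or die "No word/document.xml found in '$path'\n";
    return xml_to_text($xml);
}

# When run as a script, print the text of every file named on the command line.
unless (caller) {
    binmode STDOUT, ':encoding(UTF-8)';
    print docx_to_text($_) for @ARGV;
}
```

If you save it as, say, docx2txt.pl, then `perl docx2txt.pl report.docx > report.txt` should give you the plain text. To use a fuller stylesheet you found online, load it with `XML::LibXML->load_xml(location => $file)` instead of the inline string.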
Cheers !
J.