How to convert a webpage (from an intranet wiki) to an Office document?

Marc-Olivier Titeux · Jun 4, 2012

I have a set of Wiki pages (MediaWiki style) on my company's intranet that I would like to convert to Microsoft Office Word documents (or to something that I can import into Word). I am looking for a solution that meets the following:

Requirements

  • Keeps the formatting as much as possible
  • Does not require changing anything on the server that hosts the Wiki (I cannot add plugins or modify configuration files)
  • Can be programmatic (I am a developer too), in Python, C#, C++ or the like

Exclusions

  • Not a chain such as "Wiki to Acrobat PDF Pro to Microsoft Office Word", as we do not have Acrobat PDF Pro. Even the non-Pro version (which offers a "Save as Microsoft Word online" option) is not available in my company (very old version of the Adobe suite). I can still export a page as a PDF, but with our Wiki the result does not look good: some elements are too big for the A4 format, and the parts that overflow are cut out of the produced PDF. I would like them to be included anyway, so that I can fix the "bad" formatting in Word afterwards
  • As it is an intranet wiki, online solutions are out of scope
  • Solutions that require copying the Wiki's database and doing the conversion elsewhere (at home, for example) are also out of scope

Options

  • The solution can run on either Windows or Linux (CentOS)
  • If it can run in batch, so much the better, but this is not required

Question

Would you have any hint of a solution that could fit my needs?

Answer

Dirk Vollmar · Jun 4, 2012

A very simple solution is to open the URL of the Wiki page in Word's Open Document dialog, e.g. by pasting the URL http://en.wikipedia.org/w/index.php?title=Microsoft_Word&printable=yes into the File Name text box (the printable=yes parameter requests MediaWiki's printable view of the page). This requires no programming and still gives a satisfactory result.

If you need a batch solution, you can write a simple VBA script that creates and saves the documents for you. The macro below shows the opening step for a single page:

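Sub OpenFromWiki()

    ' Open the printable view of the wiki page directly from its URL.
    ' ConfirmConversions:=False suppresses the file-conversion prompt;
    ' ReadOnly:=True opens the page without trying to write back to the URL.
    Documents.Open FileName:= _
        "http://en.wikipedia.org/w/index.php?title=Microsoft_Word&printable=yes", _
        ConfirmConversions:=False, ReadOnly:=True, AddToRecentFiles:=False, _
        PasswordDocument:="", PasswordTemplate:="", Revert:=False, _
        WritePasswordDocument:=""

End Sub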
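
Since the question also allows Python, the same idea can be sketched outside Word: build the printable URL for each page title and save the returned HTML locally, where Word can then open it. This is only a minimal sketch under assumptions not stated above: the helper names (printable_url, fetch_bytes, save_pages) are made up for illustration, the wiki is assumed to expose the usual index.php entry point, and the pages are assumed to be readable without an extra login step. Images and stylesheets that the page references by relative URL may not load from the saved copy.

```python
import pathlib
import urllib.parse
import urllib.request


def printable_url(index_php, title):
    """Return the printable-view URL for one wiki page (hypothetical helper)."""
    # MediaWiki writes spaces in page titles as underscores.
    query = urllib.parse.urlencode(
        {"title": title.replace(" ", "_"), "printable": "yes"}
    )
    return f"{index_php}?{query}"


def fetch_bytes(url):
    """Download a URL. Assumption: no additional authentication is needed."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def save_pages(index_php, titles, out_dir, fetch=fetch_bytes):
    """Save each page's printable HTML as <title>.html and return the paths."""
    out = pathlib.Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for title in titles:
        # Replace spaces and slashes so the title is usable as a file name.
        file_name = title.replace(" ", "_").replace("/", "_") + ".html"
        target = out / file_name
        target.write_bytes(fetch(printable_url(index_php, title)))
        saved.append(target)
    return saved
```

A call such as save_pages("http://en.wikipedia.org/w/index.php", ["Microsoft Word"], "export") would write export/Microsoft_Word.html. The fetch parameter is there so that a different downloader (for example one that handles the intranet's authentication) can be swapped in without touching the rest.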