Combine Word documents using python-docx

omri_saadon · Jul 21, 2014 · Viewed 24.9k times

I have a few Word files, each with specific content. I'm looking for a snippet that shows how to combine these files into a single file using the python-docx library.

For example, with the pywin32 library I did the following:

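# Start at the very beginning of the target document.
rng = self.doc.Range(0, 0)
for d in data:
    time.sleep(0.05)

    docstart = d.wordDoc.Content.Start
    self.word.Visible = True
    # Stop one character short of the end, leaving out the final paragraph mark.
    docend = d.wordDoc.Content.End - 1
    # Copy the source range to the clipboard, then paste it into the target.
    d.wordDoc.Range(docstart, docend).Copy()
    rng.Paste()
    # Collapse to the end of the pasted content (0 is wdCollapseEnd).
    rng.Collapse(0)
    rng.InsertBreak(win32.constants.wdPageBreak)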

But I need to do this with the python-docx library instead of win32com.client.

Answer

maerteijn · Nov 8, 2016

I've adjusted the example above to work with the latest version of python-docx (0.8.6 at the time of writing). Note that this only copies the body elements; merging the styles of those elements is more complicated:

from docx import Document

files = ['file1.docx', 'file2.docx']

def combine_word_documents(files):
    merged_document = Document()

    for index, file in enumerate(files):
        sub_doc = Document(file)

        # Don't add a page break if you've reached the last file.
        if index < len(files) - 1:
            sub_doc.add_page_break()

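        # Appending moves each top-level body element (paragraphs, tables,
        # section properties) out of sub_doc and into the merged document.
        for element in sub_doc.element.body:
            merged_document.element.body.append(element)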

    merged_document.save('merged.docx')

combine_word_documents(files)