Convert Html to Docx in c#

Luis picture Luis · Mar 25, 2011 · Viewed 52.9k times · Source

i want to convert a html page to docx in c#, how can i do it?

Answer

Leonel Sanches da Silva picture Leonel Sanches da Silva · Sep 28, 2015

My solution uses Html2OpenXml along with DocumentFormat.OpenXml (NuGet package for Html2OpenXml is here) to provide an elegant solution for ASP.NET MVC.

WordHelper.cs

public static class WordHelper
{
    public static byte[] HtmlToWord(String html)
    {
        const string filename = "test.docx";
        if (File.Exists(filename)) File.Delete(filename);

        using (MemoryStream generatedDocument = new MemoryStream())
        {
            using (WordprocessingDocument package = WordprocessingDocument.Create(
                   generatedDocument, WordprocessingDocumentType.Document))
            {
                MainDocumentPart mainPart = package.MainDocumentPart;
                if (mainPart == null)
                {
                    mainPart = package.AddMainDocumentPart();
                    new Document(new Body()).Save(mainPart);
                }

                HtmlConverter converter = new HtmlConverter(mainPart);
                Body body = mainPart.Document.Body;

                var paragraphs = converter.Parse(html);
                for (int i = 0; i < paragraphs.Count; i++)
                {
                    body.Append(paragraphs[i]);
                }

                mainPart.Document.Save();
            }

            return generatedDocument.ToArray();
        }
    }
}

Controller

    [HttpPost]
    [ValidateInput(false)]
    public FileResult Demo(CkEditorViewModel viewModel)
    {
        return File(WordHelper.HtmlToWord(viewModel.CkEditorContent),
          "application/vnd.openxmlformats-officedocument.wordprocessingml.document");
    }

I'm using CKEditor to generate HTML for this sample.