open xml reading from excel file

altandogan picture altandogan · Apr 16, 2014 · Viewed 92.4k times · Source

I want to implement openXml sdk 2.5 into my project. I do everything in this link

using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Spreadsheet;
using System.IO.Packaging;


static void Main(string[] args)
        {

            String fileName = @"C:\OPENXML\BigData.xlsx";
            // Comment one of the following lines to test the method separately.
            ReadExcelFileDOM(fileName);    // DOM
            //ReadExcelFileSAX(fileName);    // SAX
        }

        // The DOM approach.
        // Note that the code below works only for cells that contain numeric values.
        // 
        static void ReadExcelFileDOM(string fileName)
        {
            using (SpreadsheetDocument spreadsheetDocument = SpreadsheetDocument.Open(fileName, false))
            {
                WorkbookPart workbookPart = spreadsheetDocument.WorkbookPart;
                WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
                SheetData sheetData = worksheetPart.Worksheet.Elements<SheetData>().First();
                string text;

                int rowCount= sheetData.Elements<Row>().Count();

                foreach (Row r in sheetData.Elements<Row>())
                {
                    foreach (Cell c in r.Elements<Cell>())
                    {
                        text = c.CellValue.Text;
                        Console.Write(text + " ");
                    }
                }
                Console.WriteLine();
                Console.ReadKey();
            }
        }

But i am not getting any row. It hasn't entered loop. Note: I also set up openXml sdk 2.5 my computer

And I find below code this is work for numeric value.For string value it writes 0 1 2 ...

 private static void Main(string[] args)
            {
                var filePath = @"C:/OPENXML/BigData.xlsx";
                using (var document = SpreadsheetDocument.Open(filePath, false))
                {
                    var workbookPart = document.WorkbookPart;
                    var workbook = workbookPart.Workbook;

                    var sheets = workbook.Descendants<Sheet>();
                    foreach (var sheet in sheets)
                    {
                        var worksheetPart = (WorksheetPart)workbookPart.GetPartById(sheet.Id);
                        var sharedStringPart = workbookPart.SharedStringTablePart;
                        //var values = sharedStringPart.SharedStringTable.Elements<SharedStringItem>().ToArray();

                        string text;
                        var rows = worksheetPart.Worksheet.Descendants<Row>();
                        foreach (var row in rows)
                        {
                            Console.WriteLine();
                            int count = row.Elements<Cell>().Count();

                            foreach (Cell c in row.Elements<Cell>())
                            {

                                text = c.CellValue.InnerText;

                                Console.Write(text + " ");

                            }
                        }
                    }
                }

                Console.ReadLine();
            }

Answer

shunty picture shunty · Apr 16, 2014

Your approach seemed to work ok for me - in that it did "enter the loop". Nevertheless you could also try something like the following:

void Main()
{
    string fileName = @"c:\path\to\my\file.xlsx";

    using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
    {
        using (SpreadsheetDocument doc = SpreadsheetDocument.Open(fs, false))
        {
            WorkbookPart workbookPart = doc.WorkbookPart;
            SharedStringTablePart sstpart = workbookPart.GetPartsOfType<SharedStringTablePart>().First();
            SharedStringTable sst = sstpart.SharedStringTable;

            WorksheetPart worksheetPart = workbookPart.WorksheetParts.First();
            Worksheet sheet = worksheetPart.Worksheet;

            var cells = sheet.Descendants<Cell>();
            var rows = sheet.Descendants<Row>();

            Console.WriteLine("Row count = {0}", rows.LongCount());
            Console.WriteLine("Cell count = {0}", cells.LongCount());

            // One way: go through each cell in the sheet
            foreach (Cell cell in cells)
            {
                if ((cell.DataType != null) && (cell.DataType == CellValues.SharedString))
                {
                    int ssid = int.Parse(cell.CellValue.Text);
                    string str = sst.ChildElements[ssid].InnerText;
                    Console.WriteLine("Shared string {0}: {1}", ssid, str);
                }
                else if (cell.CellValue != null)
                {
                    Console.WriteLine("Cell contents: {0}", cell.CellValue.Text);
                }
             }

             // Or... via each row
             foreach (Row row in rows)
             {
                 foreach (Cell c in row.Elements<Cell>())
                 {
                     if ((c.DataType != null) && (c.DataType ==           CellValues.SharedString))
                     {
                         int ssid = int.Parse(c.CellValue.Text);
                         string str = sst.ChildElements[ssid].InnerText;
                         Console.WriteLine("Shared string {0}: {1}", ssid, str);
                     }
                     else if (c.CellValue != null)
                     {
                         Console.WriteLine("Cell contents: {0}", c.CellValue.Text);
                     }
                 }
             }
         }
     }
 }

I used the filestream approach to open the workbook because this allows you to open it with shared access - so that you can have the workbook open in Excel at the same time. The Spreadsheet.Open(... method won't work if the workbook is open elsewhere.

Perhaps that is why your code didn't work.

Note, also, the use of the SharedStringTable to get the cell text where appropriate.

EDIT 2018-07-11:

Since this post is still getting votes I should also point out that in many cases it may be a lot easier to use ClosedXML to manipulate/read/edit your workbooks. The documentation examples are pretty user friendly and the coding is, in my limited experience, much more straight forward. Just be aware that it does not (yet) implement all the Excel functions (for example INDEX and MATCH) which may or may not be an issue. [Not that I would want to be trying to deal with INDEX and MATCH in OpenXML anyway.]