Can someone help me understand shared strings in MS Excel? I tried to understand them from some blogs but could not get the complete idea. Everyone explains how to access shared strings using Open XML and where they are stored (in sharedStrings.xml). Accessing them through the API is fine. But,
I tried the following.
Shared strings are basically a space-saving mechanism. As for your questions:
A1. You can't create shared strings manually through the Excel user interface. That's because Excel, by default, already stores any text as a shared string.
A2. As mentioned, it's a space-saving mechanism. Excel 2007/2010/2013 use the Open XML format, which is basically a bunch of XML files zipped together. It might also be for ease of referencing: you just refer to an index, much like indexing into an array of strings. (But XML is inherently verbose, so I suspect it's mainly for space-saving purposes.)
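To make the "zipped XML files" and "index into an array" points concrete, here is a minimal sketch in Python using only the standard library (not the Open XML SDK). It builds a tiny in-memory archive containing just xl/sharedStrings.xml and reads the table back as a plain list. The helper names build_demo_archive and read_shared_strings are made up for illustration, and a real workbook contains many more parts than this.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# SpreadsheetML main namespace used by sharedStrings.xml and the sheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# A hand-written shared string table with two unique entries (demo data).
SHARED_STRINGS_XML = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    '<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main"'
    ' count="3" uniqueCount="2">'
    "<si><t>Name</t></si>"
    "<si><t>This is a very long string</t></si>"
    "</sst>"
)


def build_demo_archive():
    """Return the bytes of a zip holding only xl/sharedStrings.xml (hypothetical helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("xl/sharedStrings.xml", SHARED_STRINGS_XML)
    return buf.getvalue()


def read_shared_strings(archive):
    """Return the shared strings as a list, so list position == shared string index."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        root = ET.fromstring(zf.read("xl/sharedStrings.xml"))
    strings = []
    for si in root.findall("m:si", NS):
        # Plain entries keep their text in <t>; rich text entries split it across <r><t> runs.
        parts = si.findall("m:t", NS) + si.findall("m:r/m:t", NS)
        strings.append("".join(t.text or "" for t in parts))
    return strings


strings = read_shared_strings(build_demo_archive())
print(strings[1])
```

Pointing the same reading code at the bytes of a real .xlsx file should work in the same way, assuming the workbook actually has a shared string part at that path.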
Let's say you have the text "This is a very long string" in cell A1 of sheet "FirstSheet". Let's say you also have the same text in cell B7 of sheet "SecondSheet". Excel stores "This is a very long string" in the shared strings table as one entry, say at index 5. In "FirstSheet" cell A1, the Open XML SDK class Cell will contain just "5" as the CellValue. In "SecondSheet" cell B7, the SDK class Cell will also contain "5".
Basically, the CellValue only holds the index into the shared string table. This is how you save space. The assumption is that the same text is often duplicated within a worksheet as well as across different worksheets.
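The example above can be modelled in a few lines of Python. This is only a sketch of the idea, not how Excel or the Open XML SDK implement it: the class SharedStringTable and the dictionaries standing in for sheets are invented for illustration, and the five filler entries exist only so that the long string lands at index 5 as in the example.

```python
class SharedStringTable:
    """Toy model of a shared string table: each distinct text is stored once."""

    def __init__(self):
        self.strings = []   # index -> text
        self._index = {}    # text -> index, for fast duplicate detection

    def add(self, text):
        """Return the index of text, appending it only if it is not stored yet."""
        if text not in self._index:
            self._index[text] = len(self.strings)
            self.strings.append(text)
        return self._index[text]


table = SharedStringTable()

# Five unrelated entries occupy indexes 0..4 (demo data only).
for i in range(5):
    table.add(f"filler {i}")

long_text = "This is a very long string"

# Each "sheet" maps a cell reference to the index held in its CellValue.
first_sheet = {"A1": table.add(long_text)}
second_sheet = {"B7": table.add(long_text)}

print(first_sheet["A1"], second_sheet["B7"], len(table.strings))
```

Both cells end up holding the same small index, and the long text appears in the table only once no matter how many cells refer to it.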
A3. Go for shared strings if you understand how to make them work. If not, just set the actual text as the CellValue of the Cell class (with Cell.DataType set to CellValues.String instead of CellValues.SharedString).
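To show what the two choices look like in the file itself, here is a rough Python sketch (standard library only). As far as I know, a cell with CellValues.SharedString is written with t="s" and an index in its value element, while CellValues.String is written with t="str" and the text itself. The row XML, the shared list, and the helper cell_text are all made up for this illustration.

```python
import xml.etree.ElementTree as ET

NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Stand-in for the workbook's shared string table (demo data).
shared = ["Name", "This is a very long string"]

# One hand-written row: A1 refers to the table, B1 carries its text directly.
ROW_XML = (
    '<row xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" r="1">'
    '<c r="A1" t="s"><v>1</v></c>'
    '<c r="B1" t="str"><v>Stored directly</v></c>'
    "</row>"
)


def cell_text(cell, shared_strings):
    """Resolve a cell element to its text (hypothetical helper)."""
    value = cell.find("m:v", NS)
    raw = value.text if value is not None and value.text else ""
    if cell.get("t") == "s":
        # Shared string cell: the value is an index into the table.
        return shared_strings[int(raw)]
    # Otherwise the value element holds the text itself.
    return raw


row = ET.fromstring(ROW_XML)
texts = {c.get("r"): cell_text(c, shared) for c in row.findall("m:c", NS)}
print(texts)
```

The direct form is simpler to write, but every cell then carries its own copy of the text, which is exactly the duplication that shared strings avoid.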