I'm using the OleDbConnection class to retrieve data from an Excel 2000/2003 workbook:
string connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" +
"Data Source=" + filename + ";" +
"Extended Properties=\"Excel 8.0;IMEX=1\";";
OleDbConnection connection = new OleDbConnection(connectionString);
connection.Open();
// code to get table name from schema omitted
var dataAdapter = new OleDbDataAdapter(string.Format("SELECT * FROM [{0}]", name),connection);
var myDataSet = new DataSet();
dataAdapter.Fill(myDataSet, "ExcelInfo");
Now it turns out that cells in the worksheet with length greater than 255 characters are being truncated. Is this a limitation in the Microsoft.Jet.OLEDB-provider, or is there something I can do about it?
Anyone?
The OLEDB provider for excel will attempt to automatically determine the DataTypes based off of the first 8 rows of data, this can be set with the HDR=Yes/No property in the connection string. Additionally, there are multiple types that it can apply to text columns. The memo type holds over 255 characters, so if none of the first 8 rows have that then it will incorrectly set the data type.
The way to change this is by changing a registry setting called TypeGuessRows, as described here: Microsoft Support
NOTE: The valid range of values for the TypeGuessRows key is 0 to 16. However, if the value is 0, the number of source rows scanned is 16384. So if you have a very large file make sure the biggest rows are first.