Convert scientific notation to float when using OpenRowSet to import a .CSV file

amarcy picture amarcy · Sep 19, 2011 · Viewed 28.2k times · Source

I am using openrowset to import a csv file into SQL Server. One of the columns in the csv file contains numbers in scientific notation (1.08E+05) and the column in the table it is being inserted

By default it is importing the value as 1 and ignoring the .08E+05.

I have tried using cast() and convert() to convert the value directly when the query is executed as well as setting up the datatype in the table as a character string and importing it as such. All of these methods have the same behavior where the .08E+05 is ignored.

Is there a way to have the value imported as 108000 instead of 1 without the .08E+05 without having to change the csv file itself?

Setting up the datatype as a varchar and reading in the csv file appears to have the same effect with the following code:

CREATE TABLE #dataTemp (StartDate datetime, Value varchar(12))

SET @insertDataQuery = 'SELECT Date, CSVValue from OpenRowset(''MSDASQL'', ''Driver={Microsoft Text Driver (*.txt; *.csv)}; DefaultDir=' 
SET @insertDataQuery = @insertDataQuery + 'C:\Data\;'',''SELECT * FROM '+ '11091800.csv' + ''')'

INSERT INTO #dataTemp EXEC(@insertDataQuery)

SELECT * FROM #dataTemp

Not all of the values in the CSV file have the scientific notation and the value without it, e.g. 81000 come across without issue.

Answer

Aaron Bertrand picture Aaron Bertrand · Sep 19, 2011

For BULK INSERT methodologies I've often found it simpler to first move the data into a table of all varchars, then get rid of extraneous things like quoted delimiters and fix formatting. I remember having a heck of a time getting rid of the scientific notation, you can just play with the varchar table until you get it right. I remember attempting all kinds of precision/scale combinations until I finally found one that was compatible. I think for me it was FLOAT then DECIMAL(24,12)...

SELECT CONVERT(DECIMAL(24, 12), CONVERT(FLOAT, '1.08E+05'));

EDIT adding what I did to try to repro and/or demonstrate a less convoluted way.

I created a very simple CSV file:

StartDate,Value
20110808,81000
20110808,1.08E+05

Then I ran the following code (for some reason I can't get MSDASQL to run on my machine to save my life):

CREATE TABLE #dataTemp(StartDate DATETIME, Value VARCHAR(32));

BULK INSERT #dataTemp FROM 'C:\data\whatever.csv' 
    WITH (ROWTERMINATOR='\n', FIELDTERMINATOR=',', FIRSTROW = 2);

SELECT * FROM #dataTemp
GO
SELECT StartDate, CONVERT(INT, CONVERT(FLOAT, Value)) FROM #dataTemp;
GO
DROP TABLE #dataTemp;

Results:

StartDate               Value
----------------------- --------
2011-08-08 00:00:00.000 81000
2011-08-08 00:00:00.000 1.08E+05

StartDate               (No column name)
----------------------- ----------------
2011-08-08 00:00:00.000 81000
2011-08-08 00:00:00.000 108000