BCP/ Bulk Insert Fails (tab delimited file)

DataRiver picture DataRiver · Dec 9, 2013 · Viewed 20.5k times · Source

I have been trying to import data (tab delimited) into SQL server. The source data is exported from IBM Cognos. Data can be downloaded from: sample data

I have tried BCP / Bulk Insert, but it did not help. The original datafile contains a header row (which needs to be skipped).

================================== Schema:

CREATE TABLE [dbo].[DIM_Assessment](
[QueryType] [nvarchar](4000) NULL,
[QueryDate] [nvarchar](4000) NULL,
[APUID] [nvarchar](4000) NULL,
[AssessmentID] [nvarchar](4000) NULL,
[ICDCode] [nvarchar](4000) NULL,
[ICDName] [nvarchar](4000) NULL,
[LoadDate] [nvarchar](4000) NULL
) ON [PRIMARY]
GO

============================= Format File generated using the following command

bcp [dbname].dbo.dim_assessment format nul -c -f C:\config\dim_assessment.Fmt -S <IP>  -U sa -P Pwd

Content of the format file:

11.0
7
1       SQLCHAR             0       8000    "\t"     1     QueryType                    SQL_Latin1_General_CP1_CI_AS
2       SQLCHAR             0       8000    "\t"     2     QueryDate                    SQL_Latin1_General_CP1_CI_AS
3       SQLCHAR             0       8000    "\t"     3     APUID                        SQL_Latin1_General_CP1_CI_AS
4       SQLCHAR             0       8000    "\t"     4     AssessmentID                 SQL_Latin1_General_CP1_CI_AS
5       SQLCHAR             0       8000    "\t"     5     ICDCode                      SQL_Latin1_General_CP1_CI_AS
6       SQLCHAR             0       8000    "\t"     6     ICDName                      SQL_Latin1_General_CP1_CI_AS
7       SQLCHAR             0       8000    "\r\n"   7     LoadDate                     SQL_Latin1_General_CP1_CI_AS

=============================

I tried importing data using BCP / Bulk Insert, however, non of them worked.

bcp [dbname].dbo.dim_assessment IN C:\dim_assessment.dat -f C:\config\dim_assessment.Fmt -S <IP>  -U sa -P Pwd

BULK INSERT dim_assessment FROM '\\dbserver\DIM_Assessment.dat'
WITH (
  DATAFILETYPE = 'char',
  FIELDTERMINATOR = '\t',
  ROWTERMINATOR = '\r\n'
);
GO

Thank you in advance for your help@

Answer

Aaron Bertrand picture Aaron Bertrand · Dec 9, 2013

Your input file is in a terrible format.

Your format file and your BULK INSERT command both state that the end of a row should be a carriage return and line feed combination, and that there are seven columns of data. However if you open your CSV file in Notepad you will quickly see that the carriage returns and line feeds are not observed correctly in Windows (meaning they must be something other than precisely \r\n). You can also see that there aren't actually seven columns of data, but five:

QueryType   QueryDate   APUID   AssessmentID    ICDCode ICDName LoadDate
PPIC    2013-11-20 10:23:14 11431   10963       Tremors
PPIC    2013-11-20 10:23:14 11431   11299       THUMB PAIN
PPIC    2013-11-20 10:23:14 11431   11348       Environmental allergies
...

Just looking at it visually you can tell it isn't right, and you need to get a better source file before throwing it over the wall at SQL Server and expecting it to handle it smoothly:

enter image description here