I'm trying to insert some data into a table from a csv document which has all of the fields delimited with ""
ie.
APPLICANTID,NAME,CONTACT,PHONENO,MOBILENO,FAXNO,EMAIL,ADDR1,ADDR2,ADDR3,STATE,POSTCODE
"3","Snoop Dogg","Snoop Dogg","411","","","","411 High Street","USA
","","USA", "1111" "4","LL Cool J","LL Cool J","","","","","5 King
Street","","","USA","1111"
I am using an xml format file to try and overcome the "" delimiters as I believe I would have to update the data again after importing to remove the inital " if it did not.
My format file looks like the following:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="NCharTerm" TERMINATOR='",' MAX_LENGTH="12"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="5" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="6" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="7" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="8" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="9" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="10" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="11" xsi:type="CharTerm" TERMINATOR=',"' COLLATION="Latin1_General_CI_AS"/>
<FIELD ID="12" xsi:type="CharTerm" TERMINATOR="\r\n" COLLATION="Latin1_General_CI_AS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="APPLICANTID" xsi:type="SQLINT"/>
<COLUMN SOURCE="2" NAME="NAME" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="3" NAME="CONTACT" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="4" NAME="PHONENO" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="5" NAME="MOBILENO" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="6" NAME="FAXNO" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="7" NAME="EMAIL" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="8" NAME="ADDR1" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="9" NAME="ADDR2" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="10" NAME="ADDR3" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="11" NAME="STATE" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="12" NAME="POSTCODE" xsi:type="SQLCHAR"/>
</ROW>
</BCPFORMAT>
and I am running the import with the following:
BULK INSERT [PracticalDB].dbo.applicant
FROM 'C:\temp.csv'
WITH (KEEPIDENTITY, FORMATFILE='C:\temp.xml', FIRSTROW = 2)
I am getting the error:
Msg 4864, Level 16, State 1, Line 1 Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 2, column 1 (APPLICANTID).
for all of the rows.
I have tried various different combinations for the terminator including using:
TERMINATOR="","
TERMINATOR="\","
TERMINATOR='","
TERMINATOR='\","
and none of them seem to work.
Is there a correct way to escape the " so that it will be parsed correctly, assuming that that is my problem here.
Ok so I figured it out!
You can use ' instead of " when you are defining the xml attributes ie TERMINATOR='', then you can use the " within them without worrying.
Also I needed to eat the first " with a field so the other columns could be parsed correctly. This ended up with the format file
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR='"' />
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="5" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="6" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="7" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="8" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="9" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="10" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="11" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="12" xsi:type="CharTerm" TERMINATOR='","' />
<FIELD ID="13" xsi:type="CharTerm" TERMINATOR='"\r\n' />
</RECORD>
<ROW>
<COLUMN SOURCE="2" NAME="APPLICANTID" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="3" NAME="NAME" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="4" NAME="CONTACT" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="5" NAME="PHONENO" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="6" NAME="MOBILENO" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="7" NAME="FAXNO" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="8" NAME="EMAIL" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="9" NAME="ADDR1" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="10" NAME="ADDR2" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="11" NAME="ADDR3" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="12" NAME="STATE" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="13" NAME="POSTCODE" xsi:type="SQLCHAR"/>
</ROW>
</BCPFORMAT>
Where the first field is just a throw away one to remove the first " and the other fields all separate on "," and the final separates on "(newline)