I'm trying to load data from a CSV file into a MySQL database, and noticed that a large number of records seem to be skipped when I import the file.
The data comes from a government source and is very oddly formatted, with single quotes and the like in unusual places. Here's a sample of a record that is not getting inserted:
"'050441'","STANFORD HOSPITAL","CA","H_HSP_RATING_7_8","How do patients rate the hospital overall?","Patients who gave a rating of'7' or '8' (medium)","22","300 or more","37",""
This record, however, does get inserted:
"'050441'","STANFORD HOSPITAL","CA","H_HSP_RATING_0_6","How do patients rate the hospital overall?","Patients who gave a rating of '6' or lower (low)","8","300 or more","37",""
The SQL I'm using to load the data is here:
mysql> load data infile "c:\\HQI_HOSP_HCAHPS_MSR.csv" into table hospital_quality_scores fields terminated by "," enclosed by '"' lines terminated by "\n" IGNORE 1 LINES;
The format of the table I'm loading the data into is as follows:
delimiter $$
CREATE TABLE `hospital_quality_scores` (
`ProviderNumber` varchar(8) NOT NULL,
`HospitalName` varchar(50) DEFAULT NULL,
`State` varchar(2) DEFAULT NULL,
`MeasureCode` varchar(25) NOT NULL,
`Question` longtext,
`AnswerDescription` longtext,
`AnswerPercent` int(11) DEFAULT NULL,
`NumberofCompletedSurveys` varchar(50) DEFAULT NULL,
`SurveyResponseRatePercent` varchar(50) DEFAULT NULL,
`Footnote` longtext,
PRIMARY KEY (`ProviderNumber`,`MeasureCode`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8$$
Does anyone have any ideas why this is happening? It seems that only half of the records are actually being inserted correctly.
Could it be your primary key is preventing the additional data from being inserted?
Look for a record that has already been inserted with a ProviderNumber of "'050441'" and a MeasureCode of "H_HSP_RATING_7_8". If you have one of those, then it is a duplicate key problem.
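As a rough sketch of that check, a query along these lines should show whether such a row already exists. The doubled single quotes are standard SQL escaping for the literal quotes stored in the column; adjust the values to whichever record you are testing.

```sql
-- Look for an existing row with the same primary key as the skipped record.
-- '''050441''' is the string '050441' including its surrounding single quotes.
SELECT ProviderNumber, MeasureCode, AnswerDescription
FROM hospital_quality_scores
WHERE ProviderNumber = '''050441'''
  AND MeasureCode = 'H_HSP_RATING_7_8';
```

Running SHOW WARNINGS; straight after the load data statement may also tell you why rows were rejected.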
You may need to add "AnswerDescription" to the primary key to get round this issue.
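Something like the following might do it. Treat it as an untested sketch: AnswerDescription is a longtext column, so MySQL needs a prefix length for it in a key, and primary key columns have to be NOT NULL. The prefix of 100 is just an assumed value, and the statement will likely fail if the table already holds rows with a NULL AnswerDescription.

```sql
-- Assumption: the first 100 characters of AnswerDescription are enough
-- to tell the answers for one provider and measure apart.
ALTER TABLE hospital_quality_scores
  MODIFY AnswerDescription longtext NOT NULL,
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (ProviderNumber, MeasureCode, AnswerDescription(100));
```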
Regards,
Dave