LOAD DATA INFILE does not import all rows in a CSV data source

MarathonStudios · May 19, 2011 · Viewed 8.3k times

I'm trying to load data from a CSV file into a MySQL database, and I noticed that a large number of records seem to be skipped when I import the file.

The data comes from a government source and is very oddly formatted, with single quotes, etc. in unusual places. Here's a sample of a record that is not getting inserted:

"'050441'","STANFORD HOSPITAL","CA","H_HSP_RATING_7_8","How do patients rate the hospital overall?","Patients who gave a rating of'7' or '8' (medium)","22","300 or more","37",""

This record, however, does get inserted:

"'050441'","STANFORD HOSPITAL","CA","H_HSP_RATING_0_6","How do patients rate the hospital overall?","Patients who gave a rating of '6' or lower (low)","8","300 or more","37",""

The SQL I'm using to load the data is here:

mysql> load data infile "c:\\HQI_HOSP_HCAHPS_MSR.csv" into table hospital_quality_scores
    -> fields terminated by "," enclosed by '"' lines terminated by "\n"
    -> IGNORE 1 LINES;

The format of the table I'm loading the data into is as follows:

delimiter $$

CREATE TABLE `hospital_quality_scores` (
  `ProviderNumber` varchar(8) NOT NULL,
  `HospitalName` varchar(50) DEFAULT NULL,
  `State` varchar(2) DEFAULT NULL,
  `MeasureCode` varchar(25) NOT NULL,
  `Question` longtext,
  `AnswerDescription` longtext,
  `AnswerPercent` int(11) DEFAULT NULL,
  `NumberofCompletedSurveys` varchar(50) DEFAULT NULL,
  `SurveyResponseRatePercent` varchar(50) DEFAULT NULL,
  `Footnote` longtext,
  PRIMARY KEY (`ProviderNumber`,`MeasureCode`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8$$

Does anyone have any idea why this is happening? It seems that only half of the records are actually being inserted correctly.

Answer

Dave Rix · May 19, 2011

Could it be that your primary key is preventing the additional data from being inserted?

Look for a record that has already been inserted with a ProviderNumber of "'050441'" and a MeasureCode of "H_HSP_RATING_7_8". If you have one of those, then it is a duplicate key problem.
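
One way to run that check is a lookup on the two primary key columns. This is only a sketch: it assumes the single quotes from the CSV field were stored as part of the ProviderNumber value, which is why they are doubled inside the string literal.

```sql
-- Look for an already-imported row with the same composite key.
-- Assumption: the stored ProviderNumber includes the literal single quotes
-- from the CSV, so the value is '050441' with the quotes (8 characters).
SELECT ProviderNumber, MeasureCode, AnswerDescription
FROM hospital_quality_scores
WHERE ProviderNumber = '''050441'''
  AND MeasureCode = 'H_HSP_RATING_7_8';
```

If this returns a row whose AnswerDescription differs from the one in the skipped CSV line, two source records share the same (ProviderNumber, MeasureCode) pair and only one of them can be stored under the current key.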

You may need to add "AnswerDescription" to the primary key to get round this issue.
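
A rough sketch of that key change follows. Because AnswerDescription is a longtext column, MySQL requires a prefix length when it is used in a key. The prefix of 100 characters is an arbitrary assumption, not something taken from the data, and uniqueness would then only be checked on those first 100 characters. Primary key columns also have to be NOT NULL, so the statement assumes no existing row has a NULL AnswerDescription.

```sql
-- Widen the primary key so rows that differ only in AnswerDescription
-- no longer collide.
-- Assumptions: a 100-character prefix is enough to tell answers apart,
-- and no existing row has a NULL AnswerDescription.
ALTER TABLE hospital_quality_scores
  MODIFY AnswerDescription LONGTEXT NOT NULL,
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (ProviderNumber, MeasureCode, AnswerDescription(100));
```

After a change like this, the table would need to be emptied and the file loaded again for the previously skipped rows to appear.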

Regards,

Dave