MySQL has a nice CSV import function LOAD DATA INFILE
.
I have a large dataset that needs to be imported from CSV on a regular basis, so this feature is exactly what I need. I've got a working script that imports my data perfectly.
.....except.... I don't know in advance what the end-of-line terminator will be.
My SQL code currently looks something like this:
LOAD DATA INFILE '{fileName}'
INTO TABLE {importTable}
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
( {fieldList} );
This works great for some import files.
However, the import data is coming from multiple sources. Some of them have the \n
terminator; others have \r\n
. I can't predict which one I'll have.
Is there a way using LOAD DATA INFILE
to specify that my lines may be terminated with either \n
or \r\n
? How do I deal with this?
You can specify line separator as '\n' and remove trailing '\r' separators if necessary from the last field during loading.
For example -
Suppose we have the 'entries.txt' file. The line separator is '\r\n', and only after line ITEM2 | CLASS3 | DATE2
the separator is '\n':
COL1 | COL2 | COL3
ITEM1 | CLASS1 | DATE1
ITEM2 | CLASS3 | DATE2
ITEM3 | CLASS1 | DATE3
ITEM4 | CLASS2 | DATE4
CREATE TABLE statement:
CREATE TABLE entries(
column1 VARCHAR(255) DEFAULT NULL,
column2 VARCHAR(255) DEFAULT NULL,
column3 VARCHAR(255) DEFAULT NULL
)
Our LOAD DATA INFILE query:
LOAD DATA INFILE 'entries.txt' INTO TABLE entries
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\n'
IGNORE 1 LINES
(column1, column2, @var)
SET column3 = TRIM(TRAILING '\r' FROM @var);
Show results:
SELECT * FROM entries;
+---------+----------+---------+
| column1 | column2 | column3 |
+---------+----------+---------+
| ITEM1 | CLASS1 | DATE1 |
| ITEM2 | CLASS3 | DATE2 |
| ITEM3 | CLASS1 | DATE3 |
| ITEM4 | CLASS2 | DATE4 |
+---------+----------+---------+