How to import data containing double quotes into MongoDB using mongoimport?

ciso · Jul 22, 2015

I'm using mongoimport to import a CSV file. The file contains text with embedded double quotes in the second data row.

"id","text"
"1","This is text"
"2","\"This is quoted text\""

This should import as two documents, with the second document's text including the beginning and ending quotes. However, mongoimport responds with:

c:\mongoimport -d testdb -c testtb --headerline --type csv --drop --file c:/temp1.csv
connected to: localhost
dropping: testdb.testtb
Failed: read error on entry #2: line 3, column 6: extraneous " in field
   imported 0 documents    error "read error: bare " in non-quoted field imported 0 documents.

How do you import CSV data containing double quotes within quoted fields? Is there another escape method?

My environment is Windows based.

Answer

Miloš Stanić · Mar 29, 2016

The mongoimport documentation addresses this under the --type option: https://docs.mongodb.org/v3.0/reference/program/mongoimport/#cmdoption--type

The csv parser accepts data that complies with RFC 4180. As a result, backslashes are not a valid escape character. If you use double-quotes to enclose fields in the CSV data, you must escape internal double-quote marks by prepending another double-quote.

So to make things clear: instead of escaping a double-quote with a backslash, you escape it with another double-quote, so every embedded quote appears as two double-quotes. For the file in the question, the second data row becomes:

"2","""This is quoted text"""
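If the file is produced by a tool you cannot change, one option is to rewrite it before running mongoimport. The following is a minimal sketch in Python, not part of mongoimport itself; the function name convert_backslash_escapes is hypothetical, and it assumes that backslashes appear in the input only as escape characters (a literal backslash in the data would be consumed by the reader).

```python
import csv
import io


def convert_backslash_escapes(text: str) -> str:
    """Re-encode CSV that uses \\" escapes as RFC 4180 CSV with doubled quotes.

    Hypothetical helper for illustration; assumes backslash is used only
    as an escape character in the input.
    """
    # Read the input, treating backslash as the escape character
    # instead of the RFC 4180 doubled-quote convention.
    reader = csv.reader(
        io.StringIO(text, newline=""),
        doublequote=False,
        escapechar="\\",
    )
    # Write every field quoted, with embedded quotes doubled.
    out = io.StringIO(newline="")
    writer = csv.writer(
        out,
        quoting=csv.QUOTE_ALL,
        doublequote=True,
        lineterminator="\n",
    )
    writer.writerows(reader)
    return out.getvalue()


if __name__ == "__main__":
    sample = '"id","text"\n"1","This is text"\n"2","\\"This is quoted text\\""\n'
    print(convert_backslash_escapes(sample), end="")
```

Saving the converted text to a new file and passing that file to --file should then let the import from the question proceed, since the embedded quotes follow the doubled-quote rule quoted above.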