Dealing with commas in a CSV file

csv
Bob The Janitor · Apr 20, 2009 · Viewed 539.4k times

I am looking for suggestions on how to handle a CSV file that is created and then uploaded by our customers, and that may have a comma in a value, such as a company name.

Some of the ideas we are considering are quoted values ("value","values","etc.") or using a | instead of a comma. The biggest problem is that we have to make it easy, or the customers won't do it.

Answer

Corey Trager · Apr 20, 2009

As of 2017, CSV is fully specified by RFC 4180.

It is a very common specification, and many libraries support it completely.

Simply use any readily available CSV library that implements RFC 4180.
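
As a rough sketch of the library approach, assuming Python and its standard csv module (the helper name rows_to_csv and the sample company name are illustrative, not from the question), the writer quotes only the fields that need it:

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize rows to CSV text; fields containing commas, quotes,
    or line breaks are wrapped in double quotes by the writer."""
    buffer = io.StringIO(newline="")
    # Default dialect: comma delimiter, minimal quoting, CRLF line endings.
    writer = csv.writer(buffer)
    writer.writerows(rows)
    return buffer.getvalue()


sample = rows_to_csv([
    ["id", "company"],
    ["1", "Acme, Inc."],  # hypothetical company name containing a comma
])
print(repr(sample))  # -> 'id,company\r\n1,"Acme, Inc."\r\n'
```

With this approach the customer never has to think about quoting, as long as the tool that produces the file uses a CSV library rather than joining strings with commas.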


There's actually a spec for CSV format and how to handle commas:

Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes.

http://tools.ietf.org/html/rfc4180

So, to represent the two values foo and bar,baz, you write:

foo,"bar,baz"
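
To check how such a line is read back, here is a small sketch assuming Python's standard csv module (parse_csv is a hypothetical helper name). The reader treats the quoted comma as part of the value rather than as a separator:

```python
import csv
import io


def parse_csv(text):
    """Parse CSV text into a list of rows, each row a list of strings."""
    # newline="" leaves line endings untouched so the csv module handles them.
    return list(csv.reader(io.StringIO(text, newline="")))


rows = parse_csv('foo,"bar,baz"\r\n')
print(rows)  # -> [['foo', 'bar,baz']]
```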

Another important requirement to consider (also from the spec):

If double-quotes are used to enclose fields, then a double-quote appearing inside a field must be escaped by preceding it with another double quote. For example:

"aaa","b""bb","ccc"