Bash: Parse CSV with quotes, commas and newlines

Jacob Horbulyk · Mar 29, 2016

Say I have the following CSV file:

 id,message,time
 123,"Sorry, This message
 has commas and newlines",2016-03-28T20:26:39
 456,"It makes the problem non-trivial",2016-03-28T20:26:41

I want to write a bash command that returns only the time column, i.e.:

time
2016-03-28T20:26:39
2016-03-28T20:26:41

What is the most straightforward way to do this? You can assume the availability of standard Unix utilities such as awk, gawk, cut, grep, etc.

Note the double quotes (") around fields that contain commas and newline characters, which make trivial attempts such as

cut -d , -f 3 file.csv

futile.

Answer

hek2mgl · Mar 29, 2016

As chepner said, you are encouraged to use a programming language that can parse CSV.

Here is an example in Python:

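import csv

# Python 3: open in text mode with newline='' so the csv module
# itself handles newlines embedded in quoted fields.
# (Under Python 2, open the file with mode 'rb' instead.)
with open('a.csv', newline='') as csvfile:
    reader = csv.reader(csvfile, quotechar='"')
    for row in reader:
        print(row[-1])  # row[-1] gives the last column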
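
If the column should be selected by its header name rather than by position, csv.DictReader may be a convenient variant. The following is only a sketch under assumptions that are not part of the original answer: the helper name column_values is hypothetical, and the sample data from the question is inlined through io.StringIO so that the snippet runs on its own.

```python
import csv
import io

# Sample data copied from the question; in practice it would come from a file.
SAMPLE = '''id,message,time
123,"Sorry, This message
has commas and newlines",2016-03-28T20:26:39
456,"It makes the problem non-trivial",2016-03-28T20:26:41
'''


def column_values(stream, name):
    """Return all values of column `name` from a CSV text stream.

    Hypothetical helper for illustration; the first row is used as the header.
    """
    reader = csv.DictReader(stream)
    return [row[name] for row in reader]


if __name__ == "__main__":
    # newline='' leaves embedded newlines untouched for the csv module.
    values = column_values(io.StringIO(SAMPLE, newline=''), 'time')
    print('time')  # header line, as in the desired output of the question
    for value in values:
        print(value)
```

To run this against a real file, the io.StringIO(...) argument could be replaced by a file object opened with open('file.csv', newline=''), assuming the file name used in the question.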