Using jq, how can arbitrary JSON encoding an array of shallow objects be converted to CSV?
There are plenty of Q&As on this site that cover specific data models which hard-code the fields, but answers to this question should work given any JSON, with the only restriction that it's an array of objects with scalar properties (no deep/complex/sub-objects, as flattening these is another question). The result should contain a header row giving the field names. Preference will be given to answers that preserve the field order of the first object, but it's not a requirement. Results may enclose all cells with double-quotes, or only enclose those that require quoting (e.g. 'a,b').
Input:
[
{"code": "NSW", "name": "New South Wales", "level":"state", "country": "AU"},
{"code": "AB", "name": "Alberta", "level":"province", "country": "CA"},
{"code": "ABD", "name": "Aberdeenshire", "level":"council area", "country": "GB"},
{"code": "AK", "name": "Alaska", "level":"state", "country": "US"}
]
Possible output:
code,name,level,country
NSW,New South Wales,state,AU
AB,Alberta,province,CA
ABD,Aberdeenshire,council area,GB
AK,Alaska,state,US
Possible output:
"code","name","level","country"
"NSW","New South Wales","state","AU"
"AB","Alberta","province","CA"
"ABD","Aberdeenshire","council area","GB"
"AK","Alaska","state","US"
Input:
[
{"name": "bang", "value": "!", "level": 0},
{"name": "letters", "value": "a,b,c", "level": 0},
{"name": "letters", "value": "x,y,z", "level": 1},
{"name": "bang", "value": "\"!\"", "level": 1}
]
Possible output:
name,value,level
bang,!,0
letters,"a,b,c",0
letters,"x,y,z",1
bang,"""!""",0
Possible output:
"name","value","level"
"bang","!","0"
"letters","a,b,c","0"
"letters","x,y,z","1"
"bang","""!""","1"
First, obtain an array containing all the different object property names in your object array input. Those will be the columns of your CSV:
(map(keys) | add | unique) as $cols
Then, for each object in the object array input, map the column names you obtained to the corresponding properties in the object. Those will be the rows of your CSV.
map(. as $row | $cols | map($row[.])) as $rows
Finally, put the column names before the rows, as a header for the CSV, and pass the resulting row stream to the @csv
filter.
$cols, $rows[] | @csv
All together now. Remember to use the -r
flag to get the result as a raw string:
jq -r '(map(keys) | add | unique) as $cols | map(. as $row | $cols | map($row[.])) as $rows | $cols, $rows[] | @csv'