How to convert arbitrary simple JSON to CSV using jq?

outis · Oct 6, 2015 · Viewed 74k times

Using jq, how can arbitrary JSON encoding an array of shallow objects be converted to CSV?

There are plenty of Q&As on this site that cover specific data models which hard-code the fields, but answers to this question should work given any JSON, with the only restriction that it's an array of objects with scalar properties (no deep/complex/sub-objects, as flattening these is another question). The result should contain a header row giving the field names. Preference will be given to answers that preserve the field order of the first object, but it's not a requirement. Results may enclose all cells with double-quotes, or only enclose those that require quoting (e.g. 'a,b').

Examples

  1. Input:

    [
        {"code": "NSW", "name": "New South Wales", "level":"state", "country": "AU"},
        {"code": "AB", "name": "Alberta", "level":"province", "country": "CA"},
        {"code": "ABD", "name": "Aberdeenshire", "level":"council area", "country": "GB"},
        {"code": "AK", "name": "Alaska", "level":"state", "country": "US"}
    ]
    

    Possible output:

    code,name,level,country
    NSW,New South Wales,state,AU
    AB,Alberta,province,CA
    ABD,Aberdeenshire,council area,GB
    AK,Alaska,state,US
    

    Possible output:

    "code","name","level","country"
    "NSW","New South Wales","state","AU"
    "AB","Alberta","province","CA"
    "ABD","Aberdeenshire","council area","GB"
    "AK","Alaska","state","US"
    
  2. Input:

    [
        {"name": "bang", "value": "!", "level": 0},
        {"name": "letters", "value": "a,b,c", "level": 0},
        {"name": "letters", "value": "x,y,z", "level": 1},
        {"name": "bang", "value": "\"!\"", "level": 1}
    ]
    

    Possible output:

    name,value,level
    bang,!,0
    letters,"a,b,c",0
    letters,"x,y,z",1
    bang,"""!""",1
    

    Possible output:

    "name","value","level"
    "bang","!","0"
    "letters","a,b,c","0"
    "letters","x,y,z","1"
    "bang","""!""","1"
    

Answer

user3899165 · Oct 6, 2015

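First, collect every distinct property name that appears in any object of the input array. These will be the columns of your CSV. Note that both keys and unique sort their output, so the columns come out in alphabetical order rather than in the order of the first object: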

    (map(keys) | add | unique) as $cols
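To see what this step produces on its own, here is a small sketch. The sample data and the shell variable names (input, cols) are made up for illustration, and it assumes jq is on your PATH:

```shell
# Two objects with partly different keys (illustrative data, not from the question).
input='[{"name":"bang","value":"!"},{"name":"letters","level":1}]'

# map(keys) gives one sorted key list per object, add concatenates them,
# and unique removes the duplicates (and sorts the result).
cols=$(printf '%s' "$input" | jq -c 'map(keys) | add | unique')
echo "$cols"
# → ["level","name","value"]
```

Every key that occurs anywhere in the array ends up as a column, even if only one object has it.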

Then, for each object in the input array, look up every column name in that object. Each object becomes one row of your CSV; a property the object does not have comes back as null, which @csv later renders as an empty cell.

    map(. as $row | $cols | map($row[.])) as $rows
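A quick way to check the row-building step is to print the rows as JSON before any CSV formatting. This is only a sketch: the input is invented, and rows is just an illustrative shell variable:

```shell
# Objects that do not share all keys (illustrative data).
input='[{"a":1,"b":2},{"a":3,"c":4}]'

# Build the column list, then look up each column in each object.
rows=$(printf '%s' "$input" | jq -c '
  (map(keys) | add | unique) as $cols
  | map(. as $row | $cols | map($row[.]))')
echo "$rows"
# → [[1,2,null],[3,null,4]]
```

The null entries mark cells where an object lacks that column.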

Finally, put the column names before the rows, as a header for the CSV, and pass the resulting row stream to the @csv filter.

    $cols, $rows[] | @csv
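If you are unsure how @csv treats different value types, a one-row experiment helps. The row below is made up to cover a string containing a comma, a number, a null, and a string containing double quotes:

```shell
# One row given as a JSON array (illustrative data).
input='[["a,b", 1, null, "\"!\""]]'

# @csv expects an array of scalars: strings are quoted with embedded quotes
# doubled, numbers are left bare, and null becomes an empty cell.
line=$(printf '%s' "$input" | jq -r '.[] | @csv')
echo "$line"
# → "a,b",1,,"""!"""
```

So the result is a mix of the two output styles from the question: every string is quoted, while numbers are not.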

All together now. Remember to use the -r flag to get the result as raw text; without it, jq prints each CSV line as a JSON-encoded string:

    jq -r '(map(keys) | add | unique) as $cols | map(. as $row | $cols | map($row[.])) as $rows | $cols, $rows[] | @csv'
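Here is the full filter run against the second example from the question, followed by a possible variant for the stated preference of keeping the first object's field order. The variant is a sketch and not part of the answer above: it assumes jq 1.5 or later (for keys_unsorted) and that the first object carries every column you want. The shell variable names are illustrative:

```shell
input='[
  {"name": "bang", "value": "!", "level": 0},
  {"name": "letters", "value": "a,b,c", "level": 0},
  {"name": "letters", "value": "x,y,z", "level": 1},
  {"name": "bang", "value": "\"!\"", "level": 1}
]'

# The filter from the answer: columns are the sorted union of all keys.
sorted_csv=$(printf '%s' "$input" | jq -r '
  (map(keys) | add | unique) as $cols
  | map(. as $row | $cols | map($row[.])) as $rows
  | $cols, $rows[] | @csv')
echo "$sorted_csv"
# → "level","name","value"
# → 0,"bang","!"
# → 0,"letters","a,b,c"
# → 1,"letters","x,y,z"
# → 1,"bang","""!"""

# Variant (assumption): take the columns from the first object only,
# in their original order, using keys_unsorted.
ordered_csv=$(printf '%s' "$input" | jq -r '
  (.[0] | keys_unsorted) as $cols
  | $cols, (.[] | [.[$cols[]]]) | @csv')
echo "$ordered_csv"
# → "name","value","level"
# → "bang","!",0
# → "letters","a,b,c",0
# → "letters","x,y,z",1
# → "bang","""!""",1
```

The trade-off is that the variant silently drops any property that appears only in later objects, whereas the original filter keeps every property but sorts the columns.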