How to convert CSV file to multiline JSON?

BeanBagKing · Oct 31, 2013 · Viewed 228.6k times

Here's my code, really simple stuff...

import csv
import json

csvfile = open('file.csv', 'r')
jsonfile = open('file.json', 'w')

fieldnames = ("FirstName","LastName","IDNumber","Message")
reader = csv.DictReader( csvfile, fieldnames)
out = json.dumps( [ row for row in reader ] )
jsonfile.write(out)

Declare some field names; the reader uses csv to read the file and the field names to dump the rows to JSON. Here's the problem...

Each record in the CSV file is on a different row. I want the JSON output to be the same way. The problem is it dumps it all on one giant, long line.

I've tried using something like for line in csvfile: and then running my code below that with reader = csv.DictReader(line, fieldnames), which loops through each line, but it writes the entire file on one line, then writes the entire file again on the next line, and so on until it runs out of lines.

Any suggestions for correcting this?

Edit: To clarify, currently I have: (every record on line 1)

[{"FirstName":"John","LastName":"Doe","IDNumber":"123","Message":"None"},{"FirstName":"George","LastName":"Washington","IDNumber":"001","Message":"Something"}]

What I'm looking for: (2 records on 2 lines)

{"FirstName":"John","LastName":"Doe","IDNumber":"123","Message":"None"}
{"FirstName":"George","LastName":"Washington","IDNumber":"001","Message":"Something"}

Not each individual field indented/on a separate line, but each record on its own line.

Some sample input.

"John","Doe","001","Message1"
"George","Washington","002","Message2"

Answer

SingleNegationElimination · Oct 31, 2013

The problem with your desired output is that it is not a valid JSON document; it's a stream of JSON documents!

That's okay, if it's what you need, but it means that for each document you want in your output, you'll have to call json.dumps.
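To make the distinction concrete, here is a small sketch. The two records are shortened versions of the ones in the question, and the variable names are only for illustration. Parsing the whole text as a single document fails, while parsing it one line at a time works:

```python
import json

# Two JSON documents separated by newlines, shaped like the desired output
stream = (
    '{"FirstName": "John", "LastName": "Doe"}\n'
    '{"FirstName": "George", "LastName": "Washington"}\n'
)

# The text as a whole is not a single JSON document...
try:
    json.loads(stream)
    whole_is_valid = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    whole_is_valid = False

# ...but each line on its own is one.
records = [json.loads(line) for line in stream.splitlines() if line.strip()]
```

Anything that later reads the file has to do the same thing: read a line, parse it, repeat.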

Since the newline you want separating your documents is not contained in those documents, you're on the hook for supplying it yourself. So we just need to pull the loop out of the call to json.dumps, write each row with json.dump, and add a newline after each document written.

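import csv
import json

fieldnames = ("FirstName", "LastName", "IDNumber", "Message")

# The with block closes both files, which also flushes the output
with open('file.csv', 'r') as csvfile, open('file.json', 'w') as jsonfile:
    reader = csv.DictReader(csvfile, fieldnames)
    for row in reader:
        # One JSON document per CSV row...
        json.dump(row, jsonfile)
        # ...followed by the newline that separates documents
        jsonfile.write('\n')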
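If you want to try this without touching the disk, the same loop can be wrapped in a function that works on strings. This is only a sketch for Python 3: csv_to_json_lines is a hypothetical helper name, not something from the csv or json modules, and the sample input is the one from the question.

```python
import csv
import io
import json

def csv_to_json_lines(csv_text, fieldnames):
    """Return one JSON document per CSV row, each terminated by a newline."""
    reader = csv.DictReader(io.StringIO(csv_text), fieldnames)
    out = io.StringIO()
    for row in reader:
        out.write(json.dumps(row))  # serialize a single record
        out.write('\n')             # the separator is supplied by us
    return out.getvalue()

sample = '"John","Doe","001","Message1"\n"George","Washington","002","Message2"\n'
result = csv_to_json_lines(sample, ("FirstName", "LastName", "IDNumber", "Message"))
lines = result.splitlines()  # one JSON document per element
```

By default json.dumps puts a space after each comma and colon. If you need the compact form shown in the question, passing separators=(',', ':') to json.dumps should give it.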