My goal is to convert a JSON file into a format that can be uploaded from Cloud Storage into BigQuery (as described here) with Python.
I have tried using the newlineJSON package for the conversion but receive the following error:
JSONDecodeError: Expecting value or ']': line 2 column 1 (char 5)
Does anyone have a solution to this?
Here is the sample JSON:
[{
"key01": "value01",
"key02": "value02",
...
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
...
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
...
"keyN": "valueN"
}
]
And here's the existing Python script:
with nlj.open(url_samplejson, json_lib="simplejson") as src_:
    with nlj.open(url_convertedjson, "w") as dst_:
        for line_ in src_:
            dst_.write(line_)
The answer with jq is really useful, but if you still want to do it in Python (as the question suggests), you can use the built-in json module.
import json
from io import StringIO
in_json = StringIO("""[{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
},
{
"key01": "value01",
"key02": "value02",
"keyN": "valueN"
}
]""")
result = [json.dumps(record) for record in json.load(in_json)] # the only significant line to convert the JSON to the desired format
print('\n'.join(result))
Output:
{"key01": "value01", "key02": "value02", "keyN": "valueN"}
{"key01": "value01", "key02": "value02", "keyN": "valueN"}
{"key01": "value01", "key02": "value02", "keyN": "valueN"}
* I'm using StringIO and print here just to make the sample easier to test locally.
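With real files instead of StringIO and print, the same idea might look like the sketch below. The function name json_array_to_ndjson and its src_path / dst_path parameters are made up for illustration, and the sketch assumes the input file is small enough to load into memory in one go.

```python
import json


def json_array_to_ndjson(src_path, dst_path):
    """Rewrite a file holding one JSON array as newline-delimited JSON.

    Hypothetical helper: the name and parameters are illustrative only.
    Returns the number of records written.
    """
    with open(src_path, encoding="utf-8") as src:
        records = json.load(src)  # reads the whole array into memory

    with open(dst_path, "w", encoding="utf-8") as dst:
        for record in records:
            # json.dumps without indent escapes newlines inside strings,
            # so each record ends up on exactly one line
            dst.write(json.dumps(record) + "\n")

    return len(records)
```

Calling json_array_to_ndjson("sample.json", "converted.json") (both file names are placeholders) should produce a file with one JSON object per line. For inputs too large to fit in memory, a streaming parser would be a better fit than json.load.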
As an alternative, you can use the Python jq binding to combine this with the other answer.