I have been debugging for hours why my code randomly breaks with this error: JSONDecodeError: Expecting value: line 1 column 1 (char 0)
This is the code I have:
    while True:
        try:
            submissions = requests.get('http://reymisterio.net/data-dump/api.php/submission?filter[]=form,cs,'+client+'&filter[]=date,cs,'+since).json()['submission']['records']
            break
        except requests.exceptions.ConnectionError:
            time.sleep(100)
I've been debugging by printing requests.get(url) and requests.get(url).text, and I have encountered the following "special" cases:
1. requests.get(url) returns a successful 200 response and requests.get(url).text returns HTML. I have read online that requests.get(url).json() should fail in this case, because it can't parse HTML, but somehow it doesn't break. Why is this?
2. requests.get(url) returns a successful 200 response and requests.get(url).text is in JSON format. I don't understand why it then breaks with the JSONDecodeError when it reaches the requests.get(url).json() line.
The exact value of requests.get(url).text for case 2 is:
    {
        "submission": {
            "columns": [
                "pk",
                "form",
                "date",
                "ip"
            ],
            "records": [
                [
                    "21197",
                    "mistico-form-contacto-form",
                    "2018-09-21 09:04:41",
                    "186.179.71.106"
                ]
            ]
        }
    }
Looking at the documentation for this API it seems the only responses are in JSON format, so receiving HTML is strange. To increase the likelihood of receiving a JSON response, you can set the 'Accept' header to 'application/json'.
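Note that every call to requests.get(url) is a separate HTTP request, so the body you print and the body you later parse may not be the same. A minimal sketch for inspecting one particular response before trusting it; the helper name summarize_response is hypothetical, and it takes plain values so it can be fed from a single stored response:

```python
import json


def summarize_response(status_code, content_type, body):
    """Return (parsed_json_or_None, note) for one response.

    Hypothetical helper: pass in values taken from a single stored
    response object rather than calling requests.get() several times.
    """
    if "json" in content_type.lower():
        note = "status %s, Content-Type %r" % (status_code, content_type)
    else:
        note = "status %s, non-JSON Content-Type %r" % (status_code, content_type)
    try:
        return json.loads(body), note
    except ValueError:  # raised for an empty or non-JSON body
        # Show the start of the body: an empty string or HTML here explains
        # "Expecting value: line 1 column 1 (char 0)".
        return None, note + ", unparseable body starting with %r" % body[:60]
```

With requests, this could be called as r = requests.get(url, headers=headers) followed by summarize_response(r.status_code, r.headers.get('Content-Type', ''), r.text), so that the status, header, and body all come from the same response.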
I tried querying this API many times with parameters and did not encounter a JSONDecodeError. This error is likely the result of another error on the server side. To handle it, catch json.decoder.JSONDecodeError in addition to the ConnectionError you currently catch, and handle it in the same way as the ConnectionError.
Here is an example with all that in mind:
    import json
    import random
    import time

    import requests

    def get_submission_records(client, since, try_number=1):
        url = 'http://reymisterio.net/data-dump/api.php/submission?filter[]=form,cs,'+client+'&filter[]=date,cs,'+since
        headers = {'Accept': 'application/json'}
        try:
            response = requests.get(url, headers=headers).json()
        except (requests.exceptions.ConnectionError, json.decoder.JSONDecodeError):
            # Wait roughly 2, 4, 8, ... seconds (plus a little jitter), then retry
            time.sleep(2**try_number + random.random()*0.01)  # exponential backoff
            return get_submission_records(client, since, try_number=try_number+1)
        else:
            return response['submission']['records']
I've also wrapped this logic in a recursive function rather than using a while loop, because I think it is semantically clearer. This function also waits before trying again using exponential backoff (waiting twice as long after each failure).
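The recursive version retries forever, with waits that keep doubling. If you would rather give up after a fixed number of attempts, the same backoff can be written as a loop. This is only a sketch: the names retry_with_backoff, fetch, retry_on, and max_tries are hypothetical and not part of the code above, and fetch is any zero-argument callable, for example a lambda wrapping the requests.get(...).json() call.

```python
import random
import time


def retry_with_backoff(fetch, retry_on, max_tries=5, sleep=time.sleep):
    """Call fetch() until it succeeds or max_tries attempts have failed.

    retry_on is an exception class (or tuple of classes) that triggers a retry.
    The sleep parameter exists only so the wait can be replaced in tests.
    """
    for try_number in range(1, max_tries + 1):
        try:
            return fetch()
        except retry_on:
            if try_number == max_tries:
                raise  # give up and surface the last error
            # Exponential backoff with a little jitter: about 2, 4, 8, ... seconds
            sleep(2 ** try_number + random.random() * 0.01)
```

Passing the exception types in as retry_on keeps the helper independent of requests, so the same loop works with either the Python 3 or the Python 2.7 exception tuple.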
Edit: For Python 2.7, the error from trying to parse bad JSON is a ValueError, not a JSONDecodeError:
    import requests, time, random

    def get_submission_records(client, since, try_number=1):
        url = 'http://reymisterio.net/data-dump/api.php/submission?filter[]=form,cs,'+client+'&filter[]=date,cs,'+since
        headers = {'Accept': 'application/json'}
        try:
            response = requests.get(url, headers=headers).json()
        except (requests.exceptions.ConnectionError, ValueError):
            time.sleep(2**try_number + random.random()*0.01)  # exponential backoff
            return get_submission_records(client, since, try_number=try_number+1)
        else:
            return response['submission']['records']
So just change that except line to include a ValueError instead of json.decoder.JSONDecodeError.
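On Python 3, json.JSONDecodeError is itself a subclass of ValueError, so catching ValueError should cover both versions with one except clause. A quick standard-library check of that relationship (the helper name try_parse is hypothetical):

```python
import json


def try_parse(text):
    """Return the parsed JSON value, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:  # on Python 3 this also catches json.JSONDecodeError
        return None


# Python 2.7 has no json.JSONDecodeError, hence the getattr fallback.
decode_error = getattr(json, "JSONDecodeError", ValueError)
print(issubclass(decode_error, ValueError))
```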