I'm currently doing a lot of stuff with BigQuery, and am using a lot of try... except...
. It looks like just about every error I get back from BigQuery is a apiclient.errors.HttpError, but with different strings attached to them, i.e.:
<HttpError 409 when requesting https://www.googleapis.com/bigquery/v2/projects/some_id/datasets/some_dataset/tables?alt=json returned "Already Exists: Table some_id:some_dataset.some_table">
<HttpError 404 when requesting https://www.googleapis.com/bigquery/v2/projects/some_id/jobs/sdfgsdfg?alt=json returned "Not Found: Job some_id:sdfgsdfg">
among many others. Right now the only way I see to handle these is to run regexs on the error messages, but this is messy and definitely not ideal. Is there a better way?
BigQuery is a REST API, the errors it uses follow standard HTTP error conventions.
In python, an HttpError has a resp.status field that returns the HTTP status code. As you show above, 409 is 'conflict', 404 is 'not found'.
For example:
from googleapiclient.errors import HttpError
try:
...
except HttpError as err:
# If the error is a rate limit or connection error,
# wait and try again.
if err.resp.status in [403, 500, 503]:
time.sleep(5)
else: raise
The response is also a json object, an even better way is to parse the json and read the error reason field:
if err.resp.get('content-type', '').startswith('application/json'):
reason = json.loads(err.content).get('error').get('errors')[0].get('reason')
This can be: notFound, duplicate, accessDenied, invalidQuery, backendError, resourcesExceeded, invalid, quotaExceeded, rateLimitExceeded, timeout, etc.