Which HTTP status code should I use for a health-check failure?

Paul M Furley · Aug 19, 2014 · Viewed 14.8k times

I'm implementing a /_status/ endpoint which does some sanity checks on data in our database.

For example, we are collecting measurements and the status should go "bad" if the latest measurement is over an hour old.

I would like to point Pingdom at this URL to leverage their alerting infrastructure and tell us when something's wrong.

On a "good" status I will serve an HTML page with an HTTP 200 OK status. But what would an appropriate HTTP status code be for "bad"? Or would it be more correct not to convey this information via status code, but via HTML content instead?

Thanks!

Answer

Paolo · Dec 28, 2017

Well... this is an old question, but I ended up here, so I thought I'd give my two cents: it seems pretty clear that a 2xx should be returned if all is OK.

If health is not OK, I think it should return a 5xx result (4xx says the client is at fault in the request, and 2xx and 3xx don't signal a failure at all).

I think that a 5xx is correct because this is a special request that reports on the state of the whole service. It is also practical: most load balancers offer liveness checks based on response codes, and not all of them can parse a more complex payload (other than perhaps a regex match, which can make the check brittle).
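To illustrate the brittleness point, here is a rough sketch comparing the two kinds of check. The function names and the "Status: OK" pattern are made up for illustration and are not any particular load balancer's API.

```python
import re


def healthy_by_status(status_code):
    """Status-only check: any 2xx counts as healthy."""
    return 200 <= status_code < 300


def healthy_by_body(body, pattern=r"Status: OK"):
    """Body-matching check: healthy only if the (hypothetical) pattern is found."""
    return re.search(pattern, body) is not None


# The status-code check only depends on the response code.
assert healthy_by_status(200)
assert not healthy_by_status(503)

# The body check passes until someone rewords the page, then fails
# even though the service itself is fine.
assert healthy_by_body("<p>Status: OK</p>")
assert not healthy_by_body("<p>Status: all good</p>")
```

The body check ties monitoring to the exact wording of the page, while the status-code check survives any change to the HTML.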

I agree with @Julien that a 500 (specifically) doesn't seem appropriate, and we've decided on 503 Service Unavailable.

503 seems to fit for a couple of reasons:

  • It's a 5xx family result code, which indicates that something is wrong on the server side.
  • It is temporary in nature, indicating that the service may recover.
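For what it's worth, here is a minimal sketch of what such an endpoint could look like using only the Python standard library. It assumes the one-hour threshold from the question; `check_health`, `get_latest_measurement_time`, and `StatusHandler` are hypothetical names, and the database lookup is a placeholder rather than a real query.

```python
from datetime import datetime, timedelta, timezone
from http.server import BaseHTTPRequestHandler

MAX_AGE = timedelta(hours=1)  # "bad" if the latest measurement is over an hour old


def check_health(latest_measurement, now=None, max_age=MAX_AGE):
    """Return (status_code, body) describing the health of the service."""
    now = now or datetime.now(timezone.utc)
    if latest_measurement is None or now - latest_measurement > max_age:
        return 503, "bad: latest measurement is missing or too old\n"
    return 200, "good\n"


def get_latest_measurement_time():
    # Hypothetical placeholder: replace with a real database query.
    return datetime.now(timezone.utc)


class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/_status/":
            self.send_error(404)
            return
        code, body = check_health(get_latest_measurement_time())
        payload = body.encode("utf-8")
        self.send_response(code)  # 200 when healthy, 503 otherwise
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)
```

To try it locally, something like `HTTPServer(("127.0.0.1", 8000), StatusHandler).serve_forever()` (with `HTTPServer` imported from `http.server`) should do. Keeping the decision in a plain function like `check_health` means it can be tested without starting a server.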