Nginx will not start with host not found in upstream

rinu picture rinu · May 9, 2018 · Viewed 80.2k times · Source

I use nginx to proxy and hold persistent connections to far away servers for me.

I have configured about 15 blocks similar to this example:

upstream rinu-test {
    server test.rinu.test:443;
    keepalive 20;
}
server {
    listen 80;
    server_name test.rinu.test;
    location / {
        proxy_pass https://rinu-test;
        proxy_http_version 1.1;
        proxy_set_header Connection "";
        proxy_set_header Host $http_host;
    }
}

The problem is if the hostname can not be resolved in one or more of the upstream blocks, nginx will not (re)start. I can't use static IPs either, some of these hosts explicitly said not to do that because IPs will change. Every other solution I've seen to this error message says to get rid of upstream and do everything in the location block. That it not possible here because keepalive is only available under upstream.

I can temporarily afford to lose one server but not all 15.

Edit: Turns out nginx is not suitable for this use case. An alternative backend (upstream) keepalive proxy should be used. A custom Node.js alternative is in my answer. So far I haven't found any other alternatives that actually work.

Answer

cnst picture cnst · May 12, 2018

Earlier versions of nginx (before 1.1.4), which already powered a huge number of the most visited websites worldwide (and some still do even nowdays, if the server headers are to be believed), didn't even support keepalive on the upstream side, because there is very little benefit for doing so in the datacentre setting, unless you have a non-trivial latency between your various hosts; see https://serverfault.com/a/883019/110020 for some explanation.

Basically, unless you know you specifically need keepalive between your upstream and front-end, chances are it's only making your architecture less resilient and worse-off.

(Note that your current solution is also wrong because a change in the IP address will likewise go undetected, because you're doing hostname resolution at config reload only; so, even if nginx does start, it'll basically stop working once IP addresses of the upstream servers do change.)

Potential solutions, pick one:

  • The best solution would seem to just get rid of upstream keepalive as likely unnecessary in a datacentre environment, and use variables with proxy_pass for up-to-date DNS resolution for each request (nginx is still smart-enough to still do the caching of such resolutions)

  • Another option would be to get a paid version of nginx through a commercial subscription, which has a resolve parameter for the server directive within the upstream context.

  • Finally, another thing to try might be to use a set variable and/or a map to specify the servers within upstream; this is neither confirmed nor denied to have been implemented; e.g., it may or may not work.