I am following this very excellent tutorial: https://github.com/binblee/springcloud-swarm
When I deploy a stack to a Docker swarm that contains a single node (just the manager node), it works perfectly.
docker stack deploy -c all-in-one.yml springcloud-demo
I have four docker containers, one of them is Eureka service discovery, which all the other three containers register with successfully.
The problem is when I add a worker node to the swarm, then two of the containers will be deployed to the worker, and two to the manager, and the services deployed to the worker node cannot find the Eureka server.
java.net.UnknownHostException: eureka: Name does not resolve
This is my compose file:
version: '3'
services:
  eureka:
    image: demo-eurekaserver
    ports:
      - "8761:8761"
  web:
    image: demo-web
    environment:
      - EUREKA_SERVER_ADDRESS=http://eureka:8761/eureka
  zuul:
    image: demo-zuul
    environment:
      - EUREKA_SERVER_ADDRESS=http://eureka:8761/eureka
    ports:
      - "8762:8762"
  bookservice:
    image: demo-bookservice
    environment:
      - EUREKA_SERVER_ADDRESS=http://eureka:8761/eureka
Also, I can only access the Eureka Service Discovery server on the host on which it is deployed.
I thought that using "docker stack deploy" automatically creates an overlay network, in which all exposed ports will be routed to a host on which the respective service is running:
From https://docs.docker.com/engine/swarm/ingress/ :
All nodes participate in an ingress routing mesh. The routing mesh enables each node in the swarm to accept connections on published ports for any service running in the swarm, even if there’s no task running on the node.
This is the output of docker service ls:
manager:~/springcloud-swarm/compose$ docker service ls
ID            NAME                          MODE        REPLICAS  IMAGE                     PORTS
rirdysi0j4vk  springcloud-demo_bookservice  replicated  1/1       demo-bookservice:latest
936ewzxwg82l  springcloud-demo_eureka       replicated  1/1       demo-eurekaserver:latest  *:8761->8761/tcp
lb1p8nwshnvz  springcloud-demo_web          replicated  1/1       demo-web:latest
0s52zecjk05q  springcloud-demo_zuul         replicated  1/1       demo-zuul:latest          *:8762->8762/tcp
and of docker stack ps springcloud-demo:
manager:$ docker stack ps springcloud-demo
ID            NAME                            IMAGE                     NODE         DESIRED STATE  CURRENT STATE
o8aed04qcysy  springcloud-demo_web.1          demo-web:latest           workernode   Running        Running 2 minutes ago
yzwmx3l01b94  springcloud-demo_eureka.1       demo-eurekaserver:latest  managernode  Running        Running 2 minutes ago
rwe9y6uj3c73  springcloud-demo_bookservice.1  demo-bookservice:latest   workernode   Running        Running 2 minutes ago
iy5e237ca29o  springcloud-demo_zuul.1         demo-zuul:latest          managernode  Running        Running 2 minutes ago
UPDATE:
I successfully added another host, but now I can't add a third. I tried a couple of times, following the same steps (installing Docker, opening the requisite ports, joining the swarm), but the node cannot find the Eureka server by its container host name.
UPDATE 2:
In testing that the ports were opened, I examined the firewall config:
workernode:~$ sudo ufw status
Status: active
To      Action  From
--      ------  ----
8080    ALLOW   Anywhere
4789    ALLOW   Anywhere
7946    ALLOW   Anywhere
2377    ALLOW   Anywhere
8762    ALLOW   Anywhere
8761    ALLOW   Anywhere
22      ALLOW   Anywhere
However - when I try to hit port 2377 on the worker node from the manager node, I can't:
managernode:~$ telnet xx.xx.xx.xx 2377
Trying xx.xx.xx.xx...
telnet: Unable to connect to remote host: Connection refused
Let's break the solution into parts. The parts are related, and each one covers one piece of the problem.
Whenever we create a container without specifying a network, Docker attaches it to the default bridge network, where service discovery is unavailable. Hence, to make service discovery work properly, we are supposed to create a user-defined network, which provides isolation, DNS resolution, and many more features. All of this applies when we use the docker run command.
When docker-compose is used to run a container and no network is specified, it creates its own bridge network, which has all the properties of a user-defined network.
These bridge networks are not attachable by default, but they allow Docker containers on the local machine to connect to them.
In Docker swarm, with the swarm mode routing mesh, whenever we deploy a service without specifying an external network, it connects to the ingress network.
When you specify an external overlay network, you will notice that the created overlay network is available only on the manager, and appears on a worker node only once a service task that uses it is scheduled there. These networks are also not attachable by default and do not allow containers outside swarm services to connect to them. So you don't need to declare a network as attachable unless you connect a container to it from outside the swarm.
As there is no predefined or official limit on the number of worker or manager nodes, you should be able to connect from the third node. One possibility is that the node joined as a worker and you are trying to start a standalone container on it; the worker refuses this when the overlay network is not attachable.
Moreover, you can't deploy a service directly on a worker node. Services are created through a manager node, which takes care of replicating and scaling them according to the configuration and mode provided.
As mentioned in Getting started with swarm mode, the following must be open between the nodes:
- TCP port 2377 for cluster management communications
- TCP and UDP port 7946 for communication among nodes
- UDP port 4789 for overlay network traffic
- IP protocol 50 (ESP) for encrypted overlay network traffic
These ports should be whitelisted for communication between nodes. Most firewalls need to be reloaded after you change their rules. This is done by passing a reload option to the firewall, and the exact command varies between Linux distributions. ufw doesn't need to be reloaded, but it needs a commit if rules are added directly in its rules file.
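As a rough sketch of the firewall step with ufw (the firewall shown in the question), the ports listed above could be opened as follows. The `run` helper and the `DRY_RUN` switch are illustrative additions, not part of ufw or Docker. By default the script only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch: open the swarm ports on every node with ufw.
# DRY_RUN=1 (the default here) only prints the commands;
# set DRY_RUN=0 and run as root to actually apply them.
DRY_RUN="${DRY_RUN:-1}"

# Illustrative helper: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

open_swarm_ports() {
    run ufw allow 2377/tcp   # cluster management communications
    run ufw allow 7946/tcp   # communication among nodes
    run ufw allow 7946/udp   # communication among nodes
    run ufw allow 4789/udp   # overlay network traffic
}

open_swarm_ports
```

Only manager nodes listen on TCP port 2377, so a refused connection to that port on a worker node probably does not indicate a firewall problem by itself.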
Apart from whitelisting the above ports, you may need to whitelist the IP addresses of docker0, docker_gwbridge, and br-123456 with a /16 netmask. Otherwise service discovery will not work on the same host machine: if you try to connect to eureka at 192.168.0.12 while the eureka service runs on that same host, 192.168.0.12, it will not resolve because the firewall blocks the traffic. See "NO ROUTE TO HOST network request from container to host-ip:port published from other container".
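One way to express that whitelisting with ufw might look like the sketch below. The two subnets passed at the bottom are assumed example values, and `allow_bridge_subnets`, `run`, and `DRY_RUN` are hypothetical helper names. The real subnets should be read from `docker network inspect` on each host.

```shell
#!/bin/sh
# Sketch: allow traffic coming from Docker's local bridge subnets.
# Look up the real subnets first, for example:
#   docker network inspect bridge --format '{{range .IPAM.Config}}{{.Subnet}}{{end}}'
#   docker network inspect docker_gwbridge --format '{{range .IPAM.Config}}{{.Subnet}}{{end}}'
# DRY_RUN=1 (the default here) only prints the ufw commands.
DRY_RUN="${DRY_RUN:-1}"

# Illustrative helper: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Emit one "ufw allow from <subnet>" rule per argument.
allow_bridge_subnets() {
    for subnet in "$@"; do
        run ufw allow from "$subnet"
    done
}

# Assumed example subnets; replace them with the values found on your host.
allow_bridge_subnets 172.17.0.0/16 172.18.0.0/16
```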
Sometimes Java behaves oddly and throws java.net.MalformedURLException and similar exceptions. I have run into such a case myself: ping resolved the name properly, but Java RMI threw an error. In that situation you can define your own custom alias when you attach the service to a user-defined network.
By default, you can resolve a service by its container name. You can also resolve it as <container_name>.<network_name>. Of course, you can define an alias as well, and resolve it as <alias_name>.<network_name>.
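To check which of these name forms actually resolve, something like the following could be used from inside one of the running containers. The function name `candidate_names`, the network name `springcloud-net`, and the `<container-id>` placeholder are assumptions for illustration. The sketch also assumes the image ships `nslookup`. It only prints the lookup commands so you can run them yourself.

```shell
#!/bin/sh
# Sketch: list the DNS names under which a service should be reachable
# on a user-defined network, following the naming forms described above.
# Usage: candidate_names <service> <network> [alias]
candidate_names() {
    service="$1"
    network="$2"
    extra_alias="${3:-}"
    echo "$service"
    echo "$service.$network"
    if [ -n "$extra_alias" ]; then
        echo "$extra_alias"
        echo "$extra_alias.$network"
    fi
}

# Print one lookup command per candidate name; replace <container-id>
# with the ID of a container attached to the same network.
for name in $(candidate_names eureka springcloud-net); do
    echo "docker exec <container-id> nslookup $name"
done
```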
So you should create a user-defined overlay network after joining the swarm, and only then deploy the services. In each service definition, reference that network as an external network, and make the firewall changes as well.
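A minimal sketch of that sequence, run on a manager node, might look like this. The network name `springcloud-net`, the file name `all-in-one-net.yml`, and the `run`/`DRY_RUN` helper are assumptions chosen for illustration. The service definitions are taken from the compose file in the question, with a `networks` entry added to each. With `DRY_RUN=1` the Docker commands are only printed, but the stack file is still written.

```shell
#!/bin/sh
# Sketch: create a user-defined overlay network, then deploy the stack on it.
DRY_RUN="${DRY_RUN:-1}"
NETWORK="${NETWORK:-springcloud-net}"            # hypothetical network name
STACK_FILE="${STACK_FILE:-all-in-one-net.yml}"   # hypothetical file name

# Illustrative helper: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Write a stack file in which every service joins the external overlay network.
write_stack_file() {
    cat > "$1" <<EOF
version: '3'
services:
  eureka:
    image: demo-eurekaserver
    ports:
      - "8761:8761"
    networks:
      - $NETWORK
  web:
    image: demo-web
    environment:
      - EUREKA_SERVER_ADDRESS=http://eureka:8761/eureka
    networks:
      - $NETWORK
  zuul:
    image: demo-zuul
    environment:
      - EUREKA_SERVER_ADDRESS=http://eureka:8761/eureka
    ports:
      - "8762:8762"
    networks:
      - $NETWORK
  bookservice:
    image: demo-bookservice
    environment:
      - EUREKA_SERVER_ADDRESS=http://eureka:8761/eureka
    networks:
      - $NETWORK
networks:
  $NETWORK:
    external: true
EOF
}

# Add --attachable only if standalone containers must join the network too.
run docker network create --driver overlay "$NETWORK"
write_stack_file "$STACK_FILE"
run docker stack deploy -c "$STACK_FILE" springcloud-demo
```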
If you want to allow external containers to connect to the network, you should make the network attachable.
Since you haven't provided enough details about what is happening on the third server, I assume you are trying to start a container there, which the Docker overlay network denies because the network is not attachable.