I am setting up a rabbitmq cluster and ran into an issue during the one step in the process. Its straight out of the rabbitmq clustering guide.
root@celery:~# rabbitmqctl status
Status of node celery@celery ...
[{pid,20410},
{running_applications,[{rabbit,"RabbitMQ","2.5.1"},
{os_mon,"CPO CXC 138 46","2.2.4"},
{sasl,"SASL CXC 138 11","2.1.8"},
{mnesia,"MNESIA CXC 138 12","4.4.12"},
{stdlib,"ERTS CXC 138 10","1.16.4"},
{kernel,"ERTS CXC 138 10","2.13.4"}]},
{os,{unix,linux}},
{erlang_version,"Erlang R13B03 (erts-5.7.4) [source] [64-bit] [smp:4:4] [rq:4] [async-threads:30] [hipe] [kernel-poll:true]\n"},
{memory,[{total,25296704},
{processes,9680280},
{processes_used,9662720},
{system,15616424},
{atom,1099393},
{atom_used,1082732},
{binary,89768},
{code,11606637},
{ets,726848}]}]
...done.
root@celery:~# rabbitmqctl cluster_status
Cluster status of node celery@celery ...
[{nodes,[{disc,[celery@celery]}]},{running_nodes,[celery@celery]}]
...done.
root@celery:~# rabbitmqctl stop_app
Stopping node celery@celery ...
...done.
root@celery:~# rabbitmqctl reset
Resetting node celery@celery ...
...done.
root@celery:~# rabbitmqctl cluster worker1@worker1
Clustering node celery@celery with [worker1@worker1] ...
Error: {failed_to_cluster_with,[worker1@worker1],
"Mnesia could not connect to some nodes."}
What are the possible reasons one node wouldn't be able to connect to another?
Here's the guide I'm following: http://www.rabbitmq.com/clustering.html
I jumped into the #rabbitmq channel on freenode. Here's the discussion that followed:
14:29 shakakai: hey all, i'm having a little issue with clustering rabbitmq http://stackoverflow.com/questions/6948624/mnesia-cant-connect-to-another-node
14:30 shakakai: has anyone run into that problem before?
14:30 daysmen has left IRC (Read error: Connection reset by peer)
14:30 antares_: shakakai: make sure that epmd is running on every node
14:30 antares_: shakakai: and that port it uses (4369) is open in your firewall
14:31 |Blaze|: shakakai: is your dns correct? Can you ping worker1 from celery and celery from worker1
14:31 shakakai: |Blaze|: hmm...i'll check
14:31 daysmen has joined ([email protected])
14:32 shakakai: |Blaze|: this is where I'm a little confused, the rabbitmq nodename is worker1@worker1 but the fqdn to ping the box is "ping worker1.mydomain.com"
14:33 |Blaze|: can you "ping worker1"
14:34 shakakai: |Blaze|: no
14:34 |Blaze|: k, you'll need to fix that
14:34 hyperboreean has left IRC (Ping timeout: 250 seconds)
14:37 shakakai: |Blaze|: gotcha, so I setup a hosts file and i should be good
14:37 |Blaze|: yup
14:37 |Blaze|: in both directions
TL;DR
Make sure you can ping the rabbit nodename from each of the boxes you are clustering. If you can't, setup a hosts file for each rabbit nodename.