I'm trying to set up multi-master replication between two servers according to this tutorial: http://tecadmin.net/setup-mariadb-galera-cluster-5-5-in-centos-rhel/
My /etc/my.cnf.d/server.cnf on the first server:
[mariadb]
query_cache_size=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address=gcomm://XXX.XXX.XXX.9
wsrep_cluster_name='cluster1'
wsrep_node_address='XXX.XXX.XXX.10'
wsrep_node_name='db10'
wsrep_sst_method=rsync
wsrep_sst_auth=wsrep_sst_user:wsrep_sst_pass
and a similar one on the second server:
[mariadb]
query_cache_size=0
binlog_format=ROW
default_storage_engine=innodb
innodb_autoinc_lock_mode=2
wsrep_provider=/usr/lib64/galera/libgalera_smm.so
wsrep_cluster_address=gcomm://XXX.XXX.XXX.10
wsrep_cluster_name='cluster1'
wsrep_node_address='XXX.XXX.XXX.9'
wsrep_node_name='db9'
wsrep_sst_method=rsync
wsrep_sst_auth=wsrep_sst_user:wsrep_sst_pass
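As a sanity check that the two files agree with each other, here is a minimal Python sketch (it is not part of the tutorial). It assumes both files contain only plain key=value lines under [mariadb] and that the addresses carry no port suffix. The function names load_wsrep, peer_list and check_pair, and the 10.0.0.x addresses in the usage part, are made up for illustration.

```python
import configparser


def load_wsrep(text):
    """Parse server.cnf-style text and return the [mariadb] section as a dict."""
    parser = configparser.ConfigParser(
        interpolation=None, delimiters=("=",), allow_no_value=True
    )
    parser.read_string(text)
    # Drop surrounding quotes so 'cluster1' and cluster1 compare equal.
    return {k: (v or "").strip("'\"") for k, v in parser["mariadb"].items()}


def peer_list(cluster_address):
    """Split gcomm://a,b,c into ['a', 'b', 'c']."""
    prefix = "gcomm://"
    if cluster_address.startswith(prefix):
        cluster_address = cluster_address[len(prefix):]
    return [p.strip() for p in cluster_address.split(",") if p.strip()]


def check_pair(a, b):
    """Return a list of human-readable mismatches between two node configs."""
    problems = []
    if a.get("wsrep_cluster_name") != b.get("wsrep_cluster_name"):
        problems.append("wsrep_cluster_name differs between the nodes")
    for me, peer in ((a, b), (b, a)):
        peers = peer_list(me.get("wsrep_cluster_address", ""))
        if peer.get("wsrep_node_address") not in peers:
            problems.append(
                "%s does not list %s in wsrep_cluster_address"
                % (me.get("wsrep_node_name"), peer.get("wsrep_node_address"))
            )
    return problems


# Hypothetical example data mirroring the layout of the two files above.
NODE_A = """[mariadb]
wsrep_cluster_address=gcomm://10.0.0.9
wsrep_cluster_name='cluster1'
wsrep_node_address='10.0.0.10'
wsrep_node_name='db10'
"""
NODE_B = """[mariadb]
wsrep_cluster_address=gcomm://10.0.0.10
wsrep_cluster_name='cluster1'
wsrep_node_address='10.0.0.9'
wsrep_node_name='db9'
"""

print(check_pair(load_wsrep(NODE_A), load_wsrep(NODE_B)))
```

An empty list means the cluster names match and each node points at the other one. This only checks the files against each other; it says nothing about whether the server can actually open the port.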
On both servers there is a MySQL user wsrep_sst_user with all privileges granted ("grant all").
After running the following as root on the first server:
# service mysql bootstrap
I get the following log in /var/lib/mysql/HOST.err:
140618 10:53:23 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
140618 10:53:23 mysqld_safe WSREP: Running position recovery with --log_error='/var/lib/mysql/wsrep_recovery.qJO4Ec' --pid-file='/var/lib/mysql/HOST-recover.pid'
140618 10:53:25 mysqld_safe WSREP: Recovered position 00000000-0000-0000-0000-000000000000:-1
140618 10:53:25 [Note] WSREP: wsrep_start_position var submitted: '00000000-0000-0000-0000-000000000000:-1'
140618 10:53:25 [Note] WSREP: Read nil XID from storage engines, skipping position init
140618 10:53:25 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/galera/libgalera_smm.so'
140618 10:53:25 [Note] WSREP: wsrep_load(): Galera 25.3.5(r178) by Codership Oy <info@codership.com> loaded successfully.
140618 10:53:25 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
140618 10:53:25 [Note] WSREP: Found saved state: 00000000-0000-0000-0000-000000000000:-1
140618 10:53:25 [Note] WSREP: Passing config to GCS: base_host = XXX.XXX.XXX.10; base_port = 4567; cert.log_conflicts = no; debug = no; evs.inactive_check_period = PT0.5S; evs.inactive_timeout = PT15S; evs.join_retrans_period = PT1S; evs.max_install_timeouts = 1; evs.send_window = 4; evs.stats_report_period = PT1M; evs.suspect_timeout = PT5S; evs.user_send_window = 2; evs.view_forget_timeout = PT24H; gcache.dir = /var/lib/mysql/; gcache.keep_pages_size = 0; gcache.mem_size = 0; gcache.name = /var/lib/mysql//galera.cache; gcache.page_size = 128M; gcache.size = 128M; gcs.fc_debug = 0; gcs.fc_factor = 1.0; gcs.fc_limit = 16; gcs.fc_master_slave = no; gcs.max_packet_size = 64500; gcs.max_throttle = 0.25; gcs.recv_q_hard_limit = 9223372036854775807; gcs.recv_q_soft_limit = 0.25; gcs.sync_donor = no; gmcast.segment = 0; gmcast.version = 0; pc.announce_timeout = PT3S; pc.checksum = false; pc.ignore_quorum = false; pc.ignore_sb = false; pc.npvo = false; pc.version = 0; pc.wait_prim = true; pc.wait_prim_timeout = P30S; pc.weight = 1; proton
140618 10:53:25 [Note] WSREP: Service thread queue flushed.
140618 10:53:25 [Note] WSREP: Assign initial position for certification: -1, protocol version: -1
140618 10:53:25 [Note] WSREP: wsrep_sst_grab()
140618 10:53:25 [Note] WSREP: Start replication
140618 10:53:25 [Note] WSREP: Setting initial position to 00000000-0000-0000-0000-000000000000:-1
140618 10:53:25 [Note] WSREP: protonet asio version 0
140618 10:53:25 [Note] WSREP: Using CRC-32C (optimized) for message checksums.
140618 10:53:25 [Note] WSREP: backend: asio
140618 10:53:25 [Note] WSREP: GMCast version 0
140618 10:53:25 [Note] WSREP: (0245da72-f6c6-11e3-ab34-cae23d9ce0ea, 'tcp://0.0.0.0:4567') listening at tcp://0.0.0.0:4567
140618 10:53:25 [Note] WSREP: (0245da72-f6c6-11e3-ab34-cae23d9ce0ea, 'tcp://0.0.0.0:4567') multicast: , ttl: 1
140618 10:53:25 [Note] WSREP: EVS version 0
140618 10:53:25 [Note] WSREP: PC version 0
140618 10:53:25 [Note] WSREP: gcomm: bootstrapping new group 'cluster1'
140618 10:53:25 [ERROR] WSREP: Permission denied
140618 10:53:25 [ERROR] WSREP: failed to open gcomm backend connection: 13: error while trying to listen 'tcp://0.0.0.0:4567?socket.non_blocking=1', asio error 'Permission denied': 13 (Permission denied)
at gcomm/src/asio_tcp.cpp:listen():814
140618 10:53:25 [ERROR] WSREP: gcs/src/gcs_core.c:gcs_core_open():202: Failed to open backend connection: -13 (Permission denied)
140618 10:53:25 [ERROR] WSREP: gcs/src/gcs.c:gcs_open():1291: Failed to open channel 'cluster1' at 'gcomm://XXX.XXX.XXX.9': -13 (Permission denied)
140618 10:53:25 [ERROR] WSREP: gcs connect failed: Permission denied
140618 10:53:25 [ERROR] WSREP: wsrep::connect() failed: 7
140618 10:53:25 [ERROR] Aborting
140618 10:53:25 [Note] WSREP: Service disconnected.
140618 10:53:26 [Note] WSREP: Some threads may fail to exit.
140618 10:53:26 [Note] /usr/sbin/mysqld: Shutdown complete
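Most of the log above is informational; the lines that matter are the [ERROR] ones. A small Python sketch for pulling them out of a .err file, assuming the "YYMMDD HH:MM:SS [LEVEL] message" layout shown above (the function name error_messages is made up here, and the sample text is just three lines copied from the log):

```python
import re

# Matches lines such as: 140618 10:53:25 [ERROR] WSREP: Permission denied
ERROR_LINE = re.compile(r"^\d{6}\s+\d{1,2}:\d{2}:\d{2}\s+\[ERROR\]\s+(.*)$")


def error_messages(log_text):
    """Return the message part of every [ERROR] line, in order of appearance."""
    messages = []
    for line in log_text.splitlines():
        match = ERROR_LINE.match(line)
        if match:
            messages.append(match.group(1))
    return messages


SAMPLE = """140618 10:53:25 [Note] WSREP: gcomm: bootstrapping new group 'cluster1'
140618 10:53:25 [ERROR] WSREP: Permission denied
140618 10:53:25 [ERROR] WSREP: gcs connect failed: Permission denied
"""

for message in error_messages(SAMPLE):
    print(message)
```

Continuation lines without a timestamp (such as the "at gcomm/src/asio_tcp.cpp" line) are skipped by this sketch.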
Server version:
# mysqld --version
mysqld Ver 5.5.37-MariaDB-wsrep for Linux on x86_64 (MariaDB Server, wsrep_25.10.r3980)
I found another solution to this problem. I had upgraded from Ubuntu 14.04 LTS to Ubuntu 14.10, and after that the error happened on all servers!
The final solution (after hours of searching) was to remove the double and single quotes (" and ') from the values in the cluster configuration file.
E.g. before:
wsrep_cluster_address="gcomm://10.0.0.4,10.0.0.5"
and after:
wsrep_cluster_address=gcomm://10.0.0.4,10.0.0.5
and the error was gone!
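If there are many settings to clean up, the same change can be sketched in Python. This is only an illustration of the edit I made by hand: the function name strip_wsrep_quotes is made up, it only touches lines whose key starts with wsrep_, and it only removes a matching pair of quotes around the whole value. Back up the file before overwriting anything.

```python
import re

# key = "value" or key = 'value', where the key starts with wsrep_
QUOTED = re.compile(r"""^(\s*wsrep_\w+\s*=\s*)(["'])(.*)\2\s*$""")


def strip_wsrep_quotes(text):
    """Return text with surrounding quotes removed from wsrep_* values."""
    cleaned = []
    for line in text.splitlines():
        match = QUOTED.match(line)
        if match:
            # Keep "key=" as written, drop only the enclosing quote pair.
            cleaned.append(match.group(1) + match.group(3))
        else:
            cleaned.append(line)
    result = "\n".join(cleaned)
    # Preserve a trailing newline if the input had one.
    return result + "\n" if text.endswith("\n") else result


before = 'wsrep_cluster_address="gcomm://10.0.0.4,10.0.0.5"\n'
print(strip_wsrep_quotes(before), end="")
# prints: wsrep_cluster_address=gcomm://10.0.0.4,10.0.0.5
```

Lines that are not quoted, or that belong to other settings, pass through unchanged.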