I have an all-in-one setup with my controller and compute services running on the same node. All my nova and other dependent services are up and running. However, when I try to launch an instance, the state of the nova-compute service becomes down. Because of this, the instance is stuck in the spawning state.
> [root@localhost nova(keystone_admin)]# nova service-list
> +----+------------------+-----------------------+----------+---------+----------+----------------------------+-----------------+
> | Id | Binary           | Host                  | Zone     | Status  | State    | Updated_at                 | Disabled Reason |
> +----+------------------+-----------------------+----------+---------+----------+----------------------------+-----------------+
> | 6  | nova-cert        | localhost.localdomain | internal | enabled | up       | 2016-11-04T07:24:32.000000 | -               |
> | 7  | nova-consoleauth | localhost.localdomain | internal | enabled | up       | 2016-11-04T07:24:32.000000 | -               |
> | 8  | nova-scheduler   | localhost.localdomain | internal | enabled | up       | 2016-11-04T07:24:33.000000 | -               |
> | 9  | nova-conductor   | localhost.localdomain | internal | enabled | up       | 2016-11-04T07:24:33.000000 | -               |
> | 11 | nova-compute     | localhost.localdomain | nova     | enabled | **down** | 2016-11-04T06:43:03.000000 | -               |
> | 12 | nova-console     | localhost.localdomain | internal | enabled | up       | 2016-11-04T07:24:32.000000 | -               |
> +----+------------------+-----------------------+----------+---------+----------+----------------------------+-----------------+
====
> [root@localhost nova(keystone_admin)]# systemctl status openstack-nova-compute.service -l
> ● openstack-nova-compute.service - OpenStack Nova Compute Server
>    Loaded: loaded (/usr/lib/systemd/system/openstack-nova-compute.service; enabled; vendor preset: disabled)
>    Active: active (running) since Fri 2016-11-04 12:08:54 IST; 49min ago
>  Main PID: 37586 (nova-compute)
>    CGroup: /system.slice/openstack-nova-compute.service
>            └─37586 /usr/bin/python2 /usr/bin/nova-compute
>
> Nov 04 12:08:46 localhost.localdomain systemd[1]: Starting OpenStack Nova Compute Server...
> Nov 04 12:08:53 localhost.localdomain nova-compute[37586]: Option "verbose" from group "DEFAULT" is deprecated for removal. Its value may be silently ignored in the future.
> Nov 04 12:08:53 localhost.localdomain nova-compute[37586]: Option "notification_driver" from group "DEFAULT" is deprecated. Use option "driver" from group "oslo_messaging_notifications".
> Nov 04 12:08:54 localhost.localdomain systemd[1]: Started OpenStack Nova Compute Server.
========

The status of the nova-compute process looks perfectly fine. My RabbitMQ service is also running:
> [root@localhost nova(keystone_admin)]# systemctl status rabbitmq-server
> ● rabbitmq-server.service - RabbitMQ broker
>    Loaded: loaded (/usr/lib/systemd/system/rabbitmq-server.service; enabled; vendor preset: disabled)
>   Drop-In: /etc/systemd/system/rabbitmq-server.service.d
>            └─limits.conf
>    Active: active (running) since Thu 2016-11-03 12:32:08 IST; 24h ago
>  Main PID: 1835 (beam.smp)
>    CGroup: /system.slice/rabbitmq-server.service
>            ├─1835 /usr/lib64/erlang/erts-5.10.4/bin/beam.smp -W w -K true -A30 -P 1048576 -- -root /usr/lib64/erlang -progname erl -- -home /var/lib/rabbitmq --...
>            ├─1964 /usr/lib64/erlang/erts-5.10.4/bin/epmd -daemon
>            ├─5873 inet_gethost 4
>            └─5875 inet_gethost 4
>
> Nov 04 12:13:12 localhost.localdomain rabbitmq-server[1835]: {user,<<"guest">>,
> Nov 04 12:13:12 localhost.localdomain rabbitmq-server[1835]: [administrator],
> Nov 04 12:13:12 localhost.localdomain rabbitmq-server[1835]: rabbit_auth_backend_internal,...},
> Nov 04 12:13:12 localhost.localdomain rabbitmq-server[1835]: <<"/">>,
> Nov 04 12:13:12 localhost.localdomain rabbitmq-server[1835]: [{<<...>>,...},{...}],
> Nov 04 12:13:12 localhost.localdomain rabbitmq-server[1835]: <0.14812.0>,<0.14816.0>]}},
> Nov 04 12:13:12 localhost.localdomain rabbitmq-server[1835]: {restart_type,intrinsic},
> Nov 04 12:13:12 localhost.localdomain rabbitmq-server[1835]: {shutdown,4294967295},
> Nov 04 12:13:12 localhost.localdomain rabbitmq-server[1835]: {child_type,worker}]}]}}
> Nov 04 12:13:12 localhost.localdomain rabbitmq-server[1835]: function_clause
=======
> [root@localhost nova(keystone_admin)]# netstat -anp | grep 5672 | grep 37586
> tcp   0   0 10.1.10.22:55628   10.1.10.22:5672   ESTABLISHED 37586/python2
> tcp   0   0 10.1.10.22:56204   10.1.10.22:5672   ESTABLISHED 37586/python2
> tcp   0   0 10.1.10.22:56959   10.1.10.22:5672   ESTABLISHED 37586/python2

=====

37586 is the nova-compute process ID.
I have checked the logs for nova-compute, nova-api and nova-conductor and there are no errors.
> 2016-11-03 12:24:50.930 2092 ERROR nova.servicegroup.drivers.db DBConnectionError: (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '10.1.10.22' ([Errno 111] ECONNREFUSED)")
> 2016-11-03 12:24:53.811 2092 ERROR oslo.messaging._drivers.impl_rabbit [-] AMQP server on 10.1.10.22:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 16 seconds.
=======

Can someone suggest what I should do to fix this? Since everything is on the same node, why are these services not reachable?
If nova-compute is reported as down, there are two possible reasons: (a) the nova-compute process is actually down, or (b) it cannot communicate with rabbit, or nova-conductor cannot communicate with rabbit.
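To see why a running process can still be listed as down: as far as I know, with the database servicegroup driver (the one named in the error log in the question) a service counts as up only while its last heartbeat, the Updated_at column, is newer than service_down_time, which defaults to 60 seconds. A minimal sketch of that comparison follows; is_up is a hypothetical helper name for illustration, not nova's actual function, and the 60-second value is an assumption about your configuration.

```python
from datetime import datetime, timedelta

# Assumed default for nova's service_down_time option (seconds).
SERVICE_DOWN_TIME = timedelta(seconds=60)


def is_up(updated_at, now, down_time=SERVICE_DOWN_TIME):
    """Return True if the last heartbeat is recent enough to count as up.

    updated_at is a timestamp string in the format printed by
    `nova service-list`, e.g. 2016-11-04T07:24:32.000000.
    """
    last_heartbeat = datetime.strptime(updated_at, "%Y-%m-%dT%H:%M:%S.%f")
    return abs(now - last_heartbeat) <= down_time


# Timestamps taken from the service list in the question.
now = datetime(2016, 11, 4, 7, 24, 40)
scheduler_up = is_up("2016-11-04T07:24:33.000000", now)
compute_up = is_up("2016-11-04T06:43:03.000000", now)
```

In the listing above nova-compute last reported at 06:43:03 while the other services reported at 07:24, so its heartbeat is roughly 41 minutes stale even though systemd shows the process as running.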
As far as I can see in your logs, you have an issue with rabbit: "10.1.10.22:5672 is unreachable". Check whether rabbit is listening on this IP/port, and whether you can connect to rabbit from the compute host. I usually use nc 10.1.10.22 5672 to see whether a connection can be made.
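If nc is not installed on the host, the same reachability check can be sketched in Python with the standard socket module. The function name can_connect and the 3-second timeout are my own choices for illustration; this only tests that the TCP port accepts connections, not that AMQP authentication works.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable networks.
        return False
```

For the setup in the question you would call can_connect("10.1.10.22", 5672) on the compute host; a False result matches the ECONNREFUSED seen in the log.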
Check if nova settings for rabbit are correct. Example of correct settings:
[DEFAULT]
rpc_backend=rabbit
rabbit_host=rabbitmq-ip-here
rabbit_port=5672
rabbit_hosts=$rabbit_host:$rabbit_port
rabbit_use_ssl=false
rabbit_userid=guest
rabbit_password=guest
rabbit_login_method=AMQPLAIN
rabbit_virtual_host=/compute
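To compare what nova.conf actually contains against the example above, here is a rough sketch that pulls the rabbit_* options out of the file with Python's configparser. It assumes the file parses as plain INI; read_rabbit_settings is a hypothetical helper, and the section is a parameter because, as far as I know, newer releases read these options from [oslo_messaging_rabbit] rather than [DEFAULT].

```python
import configparser


def read_rabbit_settings(conf_text, section="DEFAULT"):
    """Return the rabbit_* options from nova.conf text, minus the password."""
    # interpolation=None keeps values like "$rabbit_host:$rabbit_port" literal;
    # strict=False tolerates options that appear more than once in the file.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    if section != "DEFAULT" and not parser.has_section(section):
        return {}
    options = parser[section]
    return {
        key: value
        for key, value in options.items()
        if key.startswith("rabbit_") and key != "rabbit_password"
    }


def read_rabbit_settings_from_file(path="/etc/nova/nova.conf", section="DEFAULT"):
    """Convenience wrapper; the default path is an assumption about your install."""
    with open(path) as handle:
        return read_rabbit_settings(handle.read(), section)
```

The host and port printed here should be the same address you tested with nc, and the virtual host and user must exist on the RabbitMQ side.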
Check the logs in /var/log/nova/*.log.
Enable debug=true in the [DEFAULT] section of nova.conf.
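With debug enabled the logs get noisy, so it can help to filter them. The sketch below picks out ERROR and CRITICAL lines, assuming the line layout seen in the question (timestamp, PID, level, logger name, message); the regular expression and the function names are my own and may need adjusting if your log format differs.

```python
import glob
import re

# Assumed layout: "2016-11-03 12:24:53.811 2092 ERROR logger.name message"
LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) (\d+) (\w+) (\S+) (.*)$"
)


def error_lines(lines):
    """Yield (timestamp, logger, message) for ERROR and CRITICAL lines."""
    for line in lines:
        match = LOG_LINE.match(line.rstrip("\n"))
        if match and match.group(3) in ("ERROR", "CRITICAL"):
            yield match.group(1), match.group(4), match.group(5)


def scan_logs(pattern="/var/log/nova/*.log"):
    """Print the error lines from every file matching the glob pattern."""
    for path in sorted(glob.glob(pattern)):
        with open(path, errors="replace") as handle:
            for timestamp, logger, message in error_lines(handle):
                print(path, timestamp, logger, message)
```

Calling scan_logs() as root on the node prints each error with its file name; compare the timestamps against the time nova-compute stopped updating in nova service-list.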