
linuxdatacenter at gmail
Aug 8, 2011, 12:49 AM
Hi,

It looks like the rabbitmq server on my main nova node keeps crashing. I keep getting messages like this on my compute nodes:

2011-08-08 09:16:31,816 ERROR nova.rpc [-] Failed to fetch message from queue: (320, u"CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'", (0, 0), '')
(nova.rpc): TRACE: Traceback (most recent call last):
(nova.rpc): TRACE:   File "/usr/lib/pymodules/python2.6/nova/rpc.py", line 126, in fetch
(nova.rpc): TRACE:     super(Consumer, self).fetch(no_ack, auto_ack, enable_callbacks)
(nova.rpc): TRACE:   File "/usr/lib/pymodules/python2.6/carrot/messaging.py", line 304, in fetch
(nova.rpc): TRACE:     message = self.backend.get(self.queue, no_ack=no_ack)
(nova.rpc): TRACE:   File "/usr/lib/pymodules/python2.6/carrot/backends/pyamqplib.py", line 252, in get
(nova.rpc): TRACE:     raw_message = self.channel.basic_get(queue, no_ack=no_ack)
(nova.rpc): TRACE:   File "/usr/lib/pymodules/python2.6/amqplib/client_0_8/channel.py", line 2032, in basic_get
(nova.rpc): TRACE:     (60, 72), # Channel.basic_get_empty
(nova.rpc): TRACE:   File "/usr/lib/pymodules/python2.6/amqplib/client_0_8/abstract_channel.py", line 89, in wait
(nova.rpc): TRACE:     self.channel_id, allowed_methods)
(nova.rpc): TRACE:   File "/usr/lib/pymodules/python2.6/amqplib/client_0_8/connection.py", line 218, in _wait_method
(nova.rpc): TRACE:     self.wait()
(nova.rpc): TRACE:   File "/usr/lib/pymodules/python2.6/amqplib/client_0_8/abstract_channel.py", line 105, in wait
(nova.rpc): TRACE:     return amqp_method(self, args)
(nova.rpc): TRACE:   File "/usr/lib/pymodules/python2.6/amqplib/client_0_8/connection.py", line 367, in _close
(nova.rpc): TRACE:     raise AMQPConnectionException(reply_code, reply_text, (class_id, method_id))
(nova.rpc): TRACE: AMQPConnectionException: (320, u"CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'", (0, 0), '')

Also, the status of the compute nodes in "nova-manage service list" is:

nova-compute enabled XXX

When I restart the rabbitmq server, I get this:

2011-08-08 09:16:34,809 ERROR nova.rpc [-] Reconnected to queue
2011-08-08 09:16:34,810 ERROR nova.rpc [-] Reconnected to queue
2011-08-08 09:16:34,811 ERROR nova.rpc [-] Reconnected to queue

It looks like the node has reconnected, but nova-compute still shows as XXX in the service list.

Can anyone suggest a reasonable remedy for this issue? (The first one I can think of is a periodic restart of the rabbitmq server and the nova-compute daemons on all my servers.)

PS. Searching Google for "nova-compute XXX" may return different results depending on your parental filter settings ;-) So it might be a good idea to change it to "OK" or whatever ;-)

Regards,
-Piotr

--
check out my blog on linux clusters:
linuxdatacenter.blogspot.com