
Mailing List Archive: OpenStack: Operators

VM status keeps ACTIVE after its nova-compute host is powered off

 

 



kimi.zhang at nsn

Aug 7, 2013, 12:11 AM

Post #1 of 6
VM status keeps ACTIVE after its nova-compute host is powered off

Hi,

Currently, if a nova-compute host goes down ungracefully (e.g. a power cut), the VMs it hosts still have ACTIVE status in the database.

Even though Nova knows the compute node is down, it does nothing to the VMs' status.

Should Nova be improved to update those VMs' status to something else (e.g. ERROR) rather than the confusing "ACTIVE"?




Kimi Zhang
+86 186 0800 8182


matt at nycresistor

Aug 7, 2013, 12:19 AM

Post #2 of 6
Re: VM status keeps ACTIVE after its nova-compute host is powered off

Imagine a scenario in which RabbitMQ fails to poll the nova-compute
service and the system is put into an inactive state, yet the instances
are still up and operating as expected.

I am not sure there is a way for OpenStack to be certain whether an
instance is down or up when message bus communications are severed, so I
don't see a benefit to either assuming down or coasting with the previous
values until they are updated.

-matt




kimi.zhang at nsn

Aug 7, 2013, 2:13 AM

Post #3 of 6
Re: VM status keeps ACTIVE after its nova-compute host is powered off

Yes, that's a valid scenario.

My concern is that the VM status from the Nova API is not "fully" reliable. If third-party tools rely on it to manage VM lifecycle, HA, or scaling, they will run into trouble.
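
One way a third-party tool could guard against this is to cross-check the recorded instance status against the state of the nova-compute service on the instance's host. The following is a minimal sketch, assuming the tool has already fetched both values; the function name and the `HOST_UNREACHABLE` label are hypothetical and not part of Nova.

```python
def effective_status(instance_status: str, compute_service_state: str) -> str:
    """Combine the recorded instance status with the host's service state.

    instance_status: the status Nova reports for the VM, e.g. "ACTIVE".
    compute_service_state: "up" or "down" for nova-compute on the VM's host.
    """
    if compute_service_state != "up":
        # The database value may be stale, so do not present it as fact.
        # This label is hypothetical; it says nothing about the VM itself.
        return "HOST_UNREACHABLE"
    return instance_status


print(effective_status("ACTIVE", "down"))
```

The label deliberately avoids ERROR: as noted earlier in the thread, an unreachable compute service does not prove that the instances on it have stopped.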


Kimi Zhang
+86 186 0800 8182



matt at nycresistor

Aug 7, 2013, 2:16 AM

Post #4 of 6
Re: VM status keeps ACTIVE after its nova-compute host is powered off

That is certainly a potential problem.




robertc at robertcollins

Aug 7, 2013, 4:34 AM

Post #5 of 6
Re: VM status keeps ACTIVE after its nova-compute host is powered off

On 7 August 2013 21:16, matt <matt [at] nycresistor> wrote:
> That is certainly a potential problem.

So I think we should separate out 'observed state' from 'intended state'.

Then in a lost-connection-to-hypervisor scenario we can report that the
observed state is 'unknown'. If the hypervisor is hard rebooted we can
report the observed state as down (and automatically restart the VMs
if their intended state is active). And so on....

We ran into a nasty bug with this with bare metal where lost IPMI
polls would lead to powering off physical servers; we solved that
narrowly but in a similar fashion... I think a systematic change of
this sort would be good.
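
The observed/intended split described above can be sketched roughly as follows. This is only an illustration under assumed names (`State`, `Instance`, `reconcile`, and the action strings are all hypothetical), not how Nova models instance state.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    DOWN = "down"
    UNKNOWN = "unknown"


@dataclass
class Instance:
    name: str
    intended: State  # what the user asked for
    observed: State  # what was last seen on the hypervisor


def reconcile(instance: Instance) -> str:
    """Return the action to take for one instance (hypothetical action names)."""
    if instance.observed is State.UNKNOWN:
        # Lost connection: report it, but never act on missing information.
        return "wait"
    if instance.intended is State.ACTIVE and instance.observed is State.DOWN:
        # The hypervisor came back and the VM is confirmed down.
        return "restart"
    return "none"
```

Treating 'unknown' as a state that triggers no action is what prevents the kind of failure mentioned for bare metal, where a lost poll was read as a power state and acted upon.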

-Rob

--
Robert Collins <rbtcollins [at] hp>
Distinguished Technologist
HP Converged Cloud

_______________________________________________
OpenStack-operators mailing list
OpenStack-operators [at] lists
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators


arahal at iweb

Aug 7, 2013, 11:32 AM

Post #6 of 6
Re: VM status keeps ACTIVE after its nova-compute host is powered off

Hi,

On 2013-08-07 07:34, Robert Collins wrote:
> On 7 August 2013 21:16, matt <matt [at] nycresistor> wrote:
>> That is certainly a potential problem.
>
> So I think we should separate out 'observed state' from 'intended state'.

This is interesting but the more I think of it the more I understand the
initial choice of state 'reporting'.

As soon as you start adding the 'observed state' you'd need to add some
kind of timestamp to make sure you're not relying on an 'old' observed
state. Moreover, you'll need to make sure the state gets updated
properly (within a sensible time frame), since you would now be
guaranteeing that the status is coherent in (near) real time.
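
The timestamp requirement can be illustrated with a small sketch. The `Observation` type, the `current_state` helper, and the 60-second threshold are assumptions made for the example, not existing Nova fields or settings.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_AGE = timedelta(seconds=60)  # assumed threshold; tune per deployment


@dataclass
class Observation:
    state: str         # e.g. "active" or "down"
    seen_at: datetime  # when the hypervisor last reported this state (UTC)


def current_state(obs: Observation, now: Optional[datetime] = None) -> str:
    """Return the observed state, or "unknown" if the observation is too old."""
    now = now or datetime.now(timezone.utc)
    if now - obs.seen_at > MAX_AGE:
        return "unknown"
    return obs.state
```

Passing `now` explicitly keeps the function easy to test; in normal use it would be left out.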

If a node goes down, the VM state is the best thing to use to decide
whether, for instance, you need to evacuate the VMs to another node. So,
unless you come up with another state for this particular situation,
you're stuck with 'active'. This is the desired state according to the
user's request, not a real-time status report on what is actually running
on the nodes.
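
Read this way, the recorded state is what tells you which VMs to move when a node fails. A minimal sketch, assuming the instance records and the set of failed hosts have already been collected (the dictionary keys and the function name are hypothetical):

```python
from typing import Dict, List, Set


def evacuation_candidates(
    instances: List[Dict[str, str]], down_hosts: Set[str]
) -> List[str]:
    """Names of instances the user wants running that sit on a failed host."""
    return [
        inst["name"]
        for inst in instances
        if inst["status"] == "ACTIVE" and inst["host"] in down_hosts
    ]


instances = [
    {"name": "web-1", "host": "compute-1", "status": "ACTIVE"},
    {"name": "web-2", "host": "compute-2", "status": "ACTIVE"},
    {"name": "batch-1", "host": "compute-1", "status": "SHUTOFF"},
]
print(evacuation_candidates(instances, {"compute-1"}))
```

If the VMs on a dead host were flipped to ERROR, this selection would lose the information about which of them were meant to be running.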

I guess that (near) real-time updates would also have a performance
impact, since the state would be updated constantly in a DB that is meant
for management.

If you want to be sure of the information, you could check both instance
and node status, but you may still need to come up with other checks to
cover corner cases ...

Ultimately, the one best way to make sure your VM is up is to try to
'ping' it in some way (all of this points back to full monitoring).
Looking at collectd or parameters from ceilometer may help here.
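
As one example of 'pinging it in some way', a monitoring script could attempt a TCP connection to a service the VM is expected to expose. This is a sketch using only the Python standard library; the function name, the choice of port, and the timeout are assumptions to adapt to the workload.

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False
```

For instance, `is_reachable(vm_address, 22)` would check that SSH on the guest accepts connections, where `vm_address` is whatever address the monitoring host can route to. A failed probe still cannot distinguish a dead VM from a network fault, so it complements the recorded state rather than replacing it.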

All this is how I understand it for now.

--
=================================================
Ahmed Rahal <arahal [at] iweb> / iWeb Technologies
Spécialiste de l'Architecture TI
/ IT Architecture Specialist
=================================================
