
Mailing List Archive: OpenStack: Dev

nova-compute won't restart (on some nodes) after Grizzly upgrade

 

 



jon at jonproulx

Aug 7, 2013, 2:35 PM

Post #1 of 4
nova-compute won't restart (on some nodes) after Grizzly upgrade

Hi All,

Apologies to those who saw this on the operators list earlier. There is a
bit of new info here and, having gotten no response there, I thought I'd take
it to a wider audience...

I'm almost through my Grizzly upgrade. I upgraded everything else first and
left nova-compute for last (Ubuntu 12.04 Cloud Archive packages).

On most nodes the nova-compute service upgraded and restarted properly, but
on some it immediately exits with:

CRITICAL nova [-] 'instance_type_memory_mb'

It would seem like this is https://code.launchpad.net/bugs/1161022 but the
fix for that was released in March and I've verified it is in the packaged
version I'm using.

The referenced bug involves the DB migration only updating non-deleted
instances in the instance_system_metadata table, and the patch skips the
lookups that are broken (and irrelevant) for deleted instances.
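
A side note on the error text: a bare quoted key name like this is what Python prints for a KeyError, so the message is consistent with (though not proof of) a dictionary lookup of a missing flavor key. The sketch below is not nova's code; `flavor_memory_mb`, `total_memory_mb` and the dict layout are hypothetical names used only to illustrate the failure and the skip-deleted idea behind the fix.

```python
# Illustrative only: not nova's actual code. All names here are hypothetical.

def flavor_memory_mb(system_metadata):
    """Read the flavor's memory size from an instance's system metadata."""
    # Raises KeyError if the migration never wrote this key for the instance.
    return int(system_metadata["instance_type_memory_mb"])


def total_memory_mb(instances, skip_deleted=True):
    """Sum flavor memory over instances, optionally ignoring deleted ones."""
    total = 0
    for inst in instances:
        if skip_deleted and inst["deleted"]:
            continue  # deleted rows may lack the migrated keys
        total += flavor_memory_mb(inst["system_metadata"])
    return total


instances = [
    {"deleted": False, "system_metadata": {"instance_type_memory_mb": "2048"}},
    {"deleted": True, "system_metadata": {}},  # never migrated
]

print(total_memory_mb(instances))  # only the live instance is counted

try:
    total_memory_mb(instances, skip_deleted=False)
except KeyError as exc:
    # str() of a KeyError is just the quoted key, matching the log line's shape
    print("CRITICAL nova [-] %s" % exc)
```

If something still feeds a deleted instance into a code path like the unguarded one, the service would fail in the same way even with the fix for the migration in place.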

Tracing the DB calls from the host shows it is trying to do lookups for
instances that were deleted last October, which is a bit surprising, as the
host has run thousands of instances since then and it's not looking those up.

It is noteworthy that this is around the time I upgraded from Essex ->
Folsom, so it's possible their state is weirder than most, having run
through that update.

There were directories for the instances in question in
/var/lib/nova/instances, so I thought "Aha!" and moved them, but on restart
I still get the same failure and same DB query for the old instances. Where
is nova getting the idea it should look these up & how can I stop it?

I've gone so far as to generate instance_type_<foo> entries in the
instance_system_metadata table for all instances ever on my deployment
(about 500k) but I still only have the cryptic "CRITICAL nova [-]
'instance_type_memory_mb'" error and a failure to start, so clearly I'm
chasing the wrong problem somehow.
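
To check whether any instances really are missing the key, a query along these lines could help. This is a sketch against an in-memory SQLite database with a deliberately simplified, assumed schema (only `instances.uuid`, `instances.deleted` and `instance_system_metadata.instance_uuid`/`key`/`value`); the real nova tables have more columns, and the function name `instances_missing_key` is hypothetical.

```python
import sqlite3

REQUIRED_KEY = "instance_type_memory_mb"


def instances_missing_key(conn, key=REQUIRED_KEY):
    """Return UUIDs of instances with no system-metadata row for `key`."""
    rows = conn.execute(
        """
        SELECT i.uuid
        FROM instances AS i
        LEFT JOIN instance_system_metadata AS m
               ON m.instance_uuid = i.uuid AND m.key = ?
        WHERE m.instance_uuid IS NULL
        ORDER BY i.uuid
        """,
        (key,),
    ).fetchall()
    return [uuid for (uuid,) in rows]


# Toy data standing in for the real tables (assumed, simplified schema).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE instances (uuid TEXT PRIMARY KEY, deleted INTEGER);
    CREATE TABLE instance_system_metadata (
        instance_uuid TEXT, key TEXT, value TEXT
    );
    INSERT INTO instances VALUES ('aaa', 0), ('bbb', 1), ('ccc', 1);
    INSERT INTO instance_system_metadata VALUES
        ('aaa', 'instance_type_memory_mb', '2048'),
        ('bbb', 'instance_type_memory_mb', '512');
    """
)

print(instances_missing_key(conn))  # UUIDs lacking the key, here only 'ccc'
```

An empty result after back-filling the entries would suggest that the stale reference lives somewhere other than the database.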

Help?
-Jon


mikal at stillhq

Aug 7, 2013, 6:02 PM

Post #2 of 4
Re: nova-compute won't restart (on some nodes) after Grizzly upgrade

Jonathan,

this would be easier to debug with a nova-compute log. Are you willing
to post one somewhere that people could take a look at?

Thanks,
Michael




--
Rackspace Australia

_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack [at] lists
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack


jon at jonproulx

Aug 11, 2013, 5:17 AM

Post #3 of 4
Re: nova-compute won't restart (on some nodes) after Grizzly upgrade

Hi Michael,

Thanks for the offer. I'd be happy to paste up some compute logs if you
have an interest, but I got around the issue with:

virsh list --all

and then 'virsh undefine' for all deleted instances on each host. I've
used hypervisors directly and high-level stuff like OpenStack (and others)
but never spent much time at the libvirt layer, so that was a bit of new
info for me. It apparently came from the operators list not long after I
sent my query here.
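
The cleanup described above could be scripted roughly as follows. This is a sketch, not what was actually run: `parse_virsh_list`, `stale_domains` and the `live_names` input are hypothetical, the sample table only approximates typical `virsh list --all` output, and how to obtain the set of domain names nova still considers live is left to the operator. It prints the `virsh undefine` commands it would issue rather than executing them.

```python
def parse_virsh_list(output):
    """Parse the table printed by `virsh list --all` into (name, state) pairs."""
    domains = []
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines, the dashed separator and the "Id Name State" header.
        if len(fields) < 3 or fields[0] == "Id":
            continue
        name, state = fields[1], " ".join(fields[2:])
        domains.append((name, state))
    return domains


def stale_domains(domains, live_names):
    """Names defined in libvirt that are not in the set of live instances."""
    return [name for name, _state in domains if name not in live_names]


# Made-up sample output; real domain names and states will differ.
SAMPLE = """\
 Id    Name                           State
----------------------------------------------------
 3     instance-0000a1b2              running
 -     instance-00000042              shut off
 -     instance-00000043              shut off
"""

domains = parse_virsh_list(SAMPLE)
for name in stale_domains(domains, live_names={"instance-0000a1b2"}):
    # Dry run: print the command instead of running it.
    print("virsh undefine %s" % name)
```

Printing first and reviewing the list by hand seems the safer design here, since undefining a domain that nova still manages would cause a different problem.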

Thanks,
-Jon




mikal at stillhq

Aug 13, 2013, 12:59 AM

Post #4 of 4
Re: nova-compute won't restart (on some nodes) after Grizzly upgrade

Jonathan, sorry for the slow reply. I had a baby on Friday last week
instead of keeping up with email. I promise it won't happen again. ;)

Did you manage these instances in virsh manually at all as part of the
upgrade? If not, I'd love you to file a bug with a log to show the
problem.

Thanks,
Michael




--
Rackspace Australia

