
Mailing List Archive: OpenStack: Dev

running HA cluster of guests within openstack

 

 



ikke at iki

Apr 13, 2012, 2:31 AM

Post #1 of 10
running HA cluster of guests within openstack

I am likely not the first one to ask this, but since I didn't find a
thread about it, I'll start one.

Is there any shared experience available on OpenStack's capabilities
for running a cluster of guests in the cloud? Do you have
experience with the following questions, or links to more info? The
questions relate to running a legacy HA cluster in a virtual env and
moving it into the cloud...

1. Private networks between guests
-> Doable now using Quantum
1.1. Defining VLANs visible to guest machines to separate the cluster's
internal traffic;
VLAN tags should not be stripped by the host (QinQ)
1.2. Setting pre-defined MAC addresses for the guests, needed by non-IP
traffic within the guest cluster (layer 2 addressing)
- Will Melange do this? According to the docs it's not in the plans.
2. HA capabilities
2.1. Failure notification times need to be fast, i.e. no waiting for a TCP timeout
- there seems to be some activity to integrate Pacemaker
2.2. Failure notification of both guests and hosts needs to be included
2.3. The guest cluster controller should be able to monitor the states
and get fast notifications of the events.
- in milliseconds rather than in seconds
- basically, the host should have the parent of the guest PID notify
of a child process failure.
- the host should have a virtual watchdog that notices a stuck guest
2.4. Failure recovery time: how fast can OpenStack bring up a failed guest?
- any measurements of the time from failure to noticing it,
and of the time until the guest is restarted and back up?
2.5. Virtual HW manager (guest isolation)
- Any plans to integrate a piece from which the state of a guest could
be reliably queried? E.g. guaranteeing that if I ask to power off
another guest, it gets done in a given time (millisecs) and does not
depend on e.g. some TCP timeout, which could lead to a split-brain
case of running two similar guests simultaneously. For example,
starting another guest to replace one that was shut down, but due to
some communications error the first one didn't really shut down
before the new one is already up.
- It should be possible to reliably cut off the guest's network and
disk access to guarantee the above case
2.6. Shared disks
- Could there be a shared SCSI device concept for the legacy HW
abstraction?
- QEMU/KVM supports this; what would it take to make OpenStack understand
such disk devices?
2.7. Isolation of redundant nodes
- In some cases there are nodes that need to back each other up (2N, N+1);
there should be a way to make sure they run on different hosts.
- This project might be aiming for that?
http://wiki.openstack.org/DistributedScheduler
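
As a rough illustration of the placement rule asked for in 2.7, here is a
minimal Python sketch. It is not the DistributedScheduler's algorithm; the
function name, the (guest, group) input format and the least-loaded
tie-breaking are assumptions made only for this example. It places guests
so that no two members of the same redundancy group land on the same host.

```python
from collections import defaultdict


def place_with_anti_affinity(guests, hosts):
    """Assign each guest to a host so that no two guests of the same
    redundancy group share a host.

    guests: list of (guest_name, group_name) tuples (hypothetical format)
    hosts:  list of host names
    Returns a dict guest_name -> host. Raises ValueError when a group
    has more members than there are hosts.
    """
    used_hosts = defaultdict(set)        # group -> hosts already holding a member
    load = {host: 0 for host in hosts}   # host -> number of guests placed so far
    placement = {}
    for guest, group in guests:
        # Only hosts that do not yet run a member of this group are eligible.
        candidates = [h for h in hosts if h not in used_hosts[group]]
        if not candidates:
            raise ValueError("no free host left for group %r" % group)
        # Prefer the least-loaded eligible host to spread guests out.
        chosen = min(candidates, key=lambda h: load[h])
        placement[guest] = chosen
        used_hosts[group].add(chosen)
        load[chosen] += 1
    return placement
```

A 2N pair would be passed as two guests with the same group name; the
ValueError case corresponds to a request that cannot be isolated with the
hosts available.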

This was just off the top of my head; it would be interesting to
hear your thoughts about the issues. This need comes from the
telco world, which would need a telco cloud with more real-time
features like these in it. Certainly the same applies to many other
legacy environments too.

BR,

Ilkka Tengvall

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to : openstack [at] lists
Unsubscribe : https://launchpad.net/~openstack
More help : https://help.launchpad.net/ListHelp


P at draigBrady

Apr 13, 2012, 4:53 AM

Post #2 of 10
Re: running HA cluster of guests within openstack

On 04/13/2012 10:31 AM, ikke wrote:
> I likely am not the first one to ask this, but since I didn't find a
> thread about it I start one.
>
> Is there any shared experience available what are the capabilities of
> OpenStack to run cluster of guests in the cloud? Do you have
> experience of the following questions, or links to more info? The
> questions relate to running a legacy HA cluster in virtual env, and
> moving it into cloud...

I'll just point out two early-stage projects
that, used in combination, can provide an HA solution.

http://wiki.openstack.org/Heat
http://wiki.openstack.org/ResourceMonitorAlertsandNotifications

These are similar to AWS CloudFormation and CloudWatch respectively.

cheers,
Pádraig.



major at mhtx

Apr 13, 2012, 4:57 AM

Post #3 of 10
Re: running HA cluster of guests within openstack

On Apr 13, 2012, at 4:31 AM, ikke wrote:

> 2.5. Virtual HW manager (guest isolation)
> - Any plans to integrate a piece from which the state of a guest could
> be reliably queried? E.g. guaranteeing that if I ask to power off
> another guest, it gets done in a given time (millisecs) and does not
> depend on e.g. some TCP timeout, which could lead to a split-brain
> case of running two similar guests simultaneously. For example,
> starting another guest to replace one that was shut down, but due to
> some communications error the first one didn't really shut down
> before the new one is already up.
> - It should be possible to reliably cut off the guest's network and
> disk access to guarantee the above case

This would be a huge win for clustering.

Having a reliable and immediate STONITH capability within a virtual environment would be really handy for environments which have sensitive needs for shared storage (whether it's remote iSCSI storage or DRBD). It would be relatively trivial to assemble a fencing daemon that makes requests to the API to hard reboot a misbehaving member of a cluster.
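
A minimal Python sketch of the request such a fencing daemon might make
follows. It assumes the Nova compute API's server "reboot" action with
type HARD and a token passed in the X-Auth-Token header; the function
names, the compute_url layout (expected to already include the API version
and tenant) and the two-second timeout are hypothetical choices for this
example, not part of any existing fencing agent.

```python
import json
import urllib.request


def build_hard_reboot_request(compute_url, server_id, token):
    """Build (but do not send) a server 'reboot' action request.

    compute_url, server_id and token are placeholders supplied by the
    caller; compute_url is assumed to end at the tenant level.
    """
    body = json.dumps({"reboot": {"type": "HARD"}}).encode("utf-8")
    return urllib.request.Request(
        url="%s/servers/%s/action" % (compute_url.rstrip("/"), server_id),
        data=body,
        headers={
            "Content-Type": "application/json",
            "X-Auth-Token": token,
        },
        method="POST",
    )


def fence(compute_url, server_id, token, timeout=2.0):
    """Send the hard reboot request.

    Returns True only if the API accepted the request (HTTP 202).
    Any network error, HTTP error or timeout is reported as False,
    so the caller treats the node as NOT fenced.
    """
    request = build_hard_reboot_request(compute_url, server_id, token)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 202
    except OSError:
        return False
```

Note that an accepted request only means the API queued the reboot; it
does not prove that the guest has actually stopped, which is exactly the
split-brain concern raised in 2.5. A cluster manager should not start a
replacement node on the strength of this return value alone.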

Good points!

--
Major Hayden


jkoelker at rackspace

Apr 13, 2012, 7:45 AM

Post #4 of 10
Re: running HA cluster of guests within openstack

On Fri, 2012-04-13 at 12:31 +0300, ikke wrote:

> 1. Private networks between guests
> -> Doable now using Quantum
> 1.1. Defining VLANs visible to guest machines to separate clusters
> internal traffic,
> VLAN tags should not be stripped by host (QinQ)

VLANs and Quantum private networks are pretty much the same thing, why
would you want both?

> 1.2. Set pre-defined MAC addresses for the guests, needed by non-IP
> traffic within the guest cluster (layer2 addressing)
> - will Melange do this, according to docs it's not in plans?

If you send the MAC address to Melange when you create the interface, it
will record it for that instance:

http://melange.readthedocs.org/en/latest/apidoc.html#interfaces

Happy Hacking!

7-11




martin.loschwitz at hastexo

Apr 13, 2012, 7:54 AM

Post #5 of 10
Re: running HA cluster of guests within openstack

Hi Ikke,

great work! :-)

On 13.04.12 11:31, ikke wrote:
> I likely am not the first one to ask this, but since I didn't find a
> thread about it I start one.
>
> Is there any shared experience available what are the capabilities of
> OpenStack to run cluster of guests in the cloud? Do you have
> experience of the following questions, or links to more info? The
> questions relate to running a legacy HA cluster in virtual env, and
> moving it into cloud...
>
> 1. Private networks between guests
> [...]
>
> BR,
>
> Ilkka Tengvall

I think, as Major pointed out already, that the biggest problem right now
is that there is a certain lack of easy-to-use STONITH solutions to trigger
STONITH events from within virtual machines. I have something cooking here
using the latest version of Pacemaker; should this turn out to work, it
would make many things a lot easier. I'll elaborate a little bit more on
this once I have it working the way I want it.

Concerning the general subject of virtual machines (and clustered VMs for
that matter) within OpenStack, I think there is some stuff missing in Nova
that would be necessary (granted -- in one way or another, it would be
possible to make Pacemaker deal with VMs that have failed within Nova, but
in my eyes, that'd be crazy). Nova knows what VMs are supposed to be there
and Nova can find out which VMs are in fact running and which are not, so
I think Nova should make sure that those VMs that are supposed to run are,
well, running :)

Best regards
Martin

--
Martin Gerhard Loschwitz
Chief Brand Officer, Principal Consultant
hastexo Professional Services




ikke at iki

Apr 15, 2012, 11:43 PM

Post #6 of 10
Re: running HA cluster of guests within openstack

On Fri, Apr 13, 2012 at 5:45 PM, Jason Kölker <jkoelker [at] rackspace> wrote:
> On Fri, 2012-04-13 at 12:31 +0300, ikke wrote:
>
>> 1. Private networks between guests
>>   -> Doable now using Quantum
>> 1.1. Defining VLANs visible to guest machines to separate clusters
>> internal traffic,
>>        VLAN tags should not be stripped by host (QinQ)
>
> VLANs and Quantum private networks are pretty much the same thing, why
> would you want both?

For legacy reasons. The cluster currently handles its internal
network with VLANs, so the cloud layer should just
virtualize the HW functionality. It would need to provide the VLAN
layer for guests for the time being, until the guest can be modified
not to require it and to handle VLAN network configuration via OpenStack
interfaces instead.

Some of the questions are due to the legacy need. OpenStack would offer
similar functionality, but if you intend to bring legacy apps as
such into the cloud, plenty of modifications are needed to adapt the
legacy SW to cloud concepts. Adaptation takes time, and in some
cases it might be cheaper & faster to adapt the cloud layer to provide
the legacy HW in virtualized form, as a HW abstraction layer.

By legacy SW I mean a HUGE amount of code written over
decades, which is not easily modifiable.

>> 1.2. Set pre-defined MAC addresses for the guests, needed by non-IP
>>        traffic within the guest cluster (layer2 addressing)
> If you send the mac address to Melange when you create the interface it
> will record it for that instance:
>
> http://melange.readthedocs.org/en/latest/apidoc.html#interfaces

Thanks for the link, it is exactly what I was looking for!

-it



ikke at iki

Apr 15, 2012, 11:57 PM

Post #7 of 10
Re: running HA cluster of guests within openstack

On Fri, Apr 13, 2012 at 2:53 PM, Pádraig Brady <P [at] draigbrady> wrote:
> On 04/13/2012 10:31 AM, ikke wrote:
> I'll just point out two early stage projects
> that used in combination can provide a HA solution.
>
> http://wiki.openstack.org/Heat
> http://wiki.openstack.org/ResourceMonitorAlertsandNotifications
> cheers,
> Pádraig.

Thanks for the links, I'll look into them. Having a pluggable
monitoring interface looks good. From a quick look I don't see how the
local driver connects to libvirt, or whether the alert is delivered
quickly or based on periodic polling. I need to take a further look into it.

Hopefully there could be a local HW watchdog emulated in QEMU that would
somehow be connected to the plugin framework to allow fast reaction
times to a guest being stuck.

Also, it would make sense to have some kind of local decision made
immediately about the reboot of a stuck guest, instead of taking time
to report it centrally and wait for the central manager's decision.
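
To make the idea of a local decision concrete, here is a minimal Python
sketch of a host-side watchdog timer. It is an illustration only, not
QEMU's or libvirt's watchdog implementation; the class name, the kick/check
interface and the injectable clock are assumptions made for the example.
The guest (or an agent on its behalf) kicks the timer periodically, and the
host triggers the recovery action itself once the deadline is missed,
without asking a central manager first.

```python
import time


class GuestWatchdog:
    """Fires a local recovery action when a guest stops sending heartbeats."""

    def __init__(self, timeout_s, on_expire, clock=time.monotonic):
        self.timeout_s = timeout_s    # allowed silence before the guest counts as stuck
        self.on_expire = on_expire    # local action, e.g. reset the guest
        self.clock = clock            # injectable for testing
        self.last_kick = clock()
        self.fired = False

    def kick(self):
        """Called on every heartbeat from the guest; re-arms the timer."""
        self.last_kick = self.clock()
        self.fired = False

    def check(self):
        """Called periodically on the host.

        Runs on_expire at most once per missed deadline and returns
        True while the watchdog is in the expired state.
        """
        if not self.fired and self.clock() - self.last_kick > self.timeout_s:
            self.fired = True
            self.on_expire()
        return self.fired
```

The reaction time is bounded by the timeout plus the interval at which
check() is called, so both need to be in the millisecond range to meet the
notification times asked for in 2.3. Reporting to a central manager can
still happen inside on_expire, after the local action has been taken.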

cheers,
Ilkka



ikke at iki

Apr 16, 2012, 12:16 AM

Post #8 of 10
Re: running HA cluster of guests within openstack

On Fri, Apr 13, 2012 at 5:54 PM, Martin Gerhard Loschwitz
<martin.loschwitz [at] hastexo> wrote:
> STONITH events from within virtual machines. I have something cooking here
> using the latest version of Pacemaker; should this turn out to work, it
> would make many things a lot easier. I'll elaborate a little bit more on
> this once I have it working the way I want it.
>
> Concerning the general subject of virtual machines (and clustered VMs for
> that matter) within OpenStack, I think there is some stuff missing in Nova
> that would be necessary (granted -- in one way or another, it would be
> possible to make Pacemaker deal with VMs that have failed within Nova, but
> in my eyes, that'd be crazy). Nova knows what VMs are supposed to be there
> and Nova can find out which VMs are in fact running and which are not, so
> I think Nova should make sure that those VMs that are supposed to run are,
> well, running :)
>
> Best regards
> Martin

Good to hear, I'm looking forward to hearing more about your project. It
sounds like it would be a plug-in to the earlier mentioned project:

http://wiki.openstack.org/ResourceMonitorAlertsandNotifications

You are right, one cannot have too many "heads" making the decisions
about HA in the cluster. Nova should handle it, or let someone else
do it and let go. But isn't that exactly what Nova is there for? So
it's Nova's job.

BR,
it



ikke at iki

Apr 16, 2012, 4:41 AM

Post #9 of 10
Re: running HA cluster of guests within openstack

One more item for the HA features: hot plugging.

2.8. Hot-plug pre-warning events.
- Nova should tell the registered client that a node/guest is going to
be shut off, and the remote end would be given time to ack that.
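
The pre-warning handshake in 2.8 could look roughly like the Python sketch
below. It is purely illustrative: Nova has no such interface in this
thread, and the class name, the register/ack/warn_and_wait methods and the
notify callback are hypothetical. Registered clients are warned, then given
a bounded grace period to acknowledge before the shutdown proceeds.

```python
import threading
import time


class ShutdownPreWarning:
    """Warn registered clients before a shutdown and collect their acks."""

    def __init__(self):
        self._acks = {}   # client name -> threading.Event set on ack

    def register(self, client):
        self._acks[client] = threading.Event()

    def ack(self, client):
        """Called by (or on behalf of) a client to acknowledge the warning."""
        self._acks[client].set()

    def warn_and_wait(self, notify, grace_s):
        """Notify every client, then wait up to grace_s seconds in total.

        notify is a callable taking the client name. Returns the set of
        clients that did not acknowledge within the grace period.
        """
        for event in self._acks.values():
            event.clear()
        for client in self._acks:
            notify(client)
        deadline = time.monotonic() + grace_s
        missing = set()
        for client, event in self._acks.items():
            # All clients share one deadline rather than getting grace_s each.
            remaining = max(0.0, deadline - time.monotonic())
            if not event.wait(remaining):
                missing.add(client)
        return missing
```

Whether the shutdown should go ahead when the returned set is not empty is
a policy decision; the sketch only makes the bounded wait explicit so that
a silent client cannot block the operation indefinitely.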



P at draigBrady

Jun 27, 2012, 3:50 PM

Post #10 of 10
Re: running HA cluster of guests within openstack

On 04/13/2012 12:53 PM, Pádraig Brady wrote:
> On 04/13/2012 10:31 AM, ikke wrote:
>> I likely am not the first one to ask this, but since I didn't find a
>> thread about it I start one.
>>
>> Is there any shared experience available what are the capabilities of
>> OpenStack to run cluster of guests in the cloud? Do you have
>> experience of the following questions, or links to more info? The
>> questions relate to running a legacy HA cluster in virtual env, and
>> moving it into cloud...
>
> I'll just point out two early stage projects
> that used in combination can provide a HA solution.
>
> http://wiki.openstack.org/Heat
> http://wiki.openstack.org/ResourceMonitorAlertsandNotifications
>
> These are similar to AWS CloudFormation and CloudWatch respectively.

I notice Heat V4 has just been released.

Here is some additional info on High Availability:
https://github.com/heat-api/heat/wiki/Roadmap-Feature:-High-Availability

and some notes on using it in its current form:
https://github.com/heat-api/heat/wiki/Using-HA

cheers,
Pádraig.
