ikke at iki
Apr 13, 2012, 2:31 AM
Post #1 of 10
running HA cluster of guests within openstack
I am likely not the first one to ask this, but since I didn't find a
thread about it, I am starting one.
Is there any shared experience of OpenStack's capabilities for running
a cluster of guests in the cloud? Do you have experience with the
following questions, or links to more info? The questions relate to
running a legacy HA cluster in a virtual environment and moving it
into the cloud.
1. Private networks between guests
-> Doable now using Quantum
1.1. Defining VLANs visible to guest machines to separate clusters
- VLAN tags should not be stripped by the host (QinQ)
1.2. Setting pre-defined MAC addresses for the guests, needed by non-IP
traffic within the guest cluster (layer 2 addressing)
- Will Melange do this? According to the docs it is not planned.
2. HA capabilities
2.1. Failure notification times need to be fast, i.e. no waiting on TCP timeouts
- there seems to be some activity to integrate Pacemaker
2.2. Failure notification of both guests and hosts needs to be included
2.3. The guest cluster controller should be able to monitor the states
and get fast notifications of the events.
- in milliseconds rather than seconds
- basically, the parent process of the guest on the host should notify
of a child process failure.
- the host should have a virtual watchdog that notices when a guest is stuck
2.4. Failure recovery time: how fast can OpenStack bring up a failed guest?
- any measurements of the time from failure to detection,
and of the time until the guest is restarted and back up?
2.5. Virtual HW manager (guest isolation)
- Any plans to integrate a component from which the state of a guest could
be reliably queried and controlled? E.g. guaranteeing that if I ask to
power off a guest, it gets done within a given time (milliseconds) and
does not depend on e.g. some TCP timeout, which could lead to a
split-brain case of two guests running simultaneously. For example:
another guest is started to replace one that was shut down, but due to
some communications error the first one didn't really shut down, and the
new one is already up.
- It should be possible to reliably cut off the guest's network and disk
access to guarantee that the above case cannot happen.
2.6. Shared disks
- Could there be a shared SCSI device concept for legacy HW?
- QEMU/KVM supports this; what would it take to make OpenStack understand
such disk devices?
2.7. Isolation of redundant nodes
- In some cases there are nodes that need to back each other up (2N, N+1);
there should be a way to make sure they run on different hosts.
- This project might be aiming for that?
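To make the failure-detection idea in 2.3 more concrete, here is a minimal sketch in plain Python. It is not OpenStack or libvirt code; the names `wait_for_exit`, `GuestWatchdog`, and the 50 ms timeout are assumptions made up for illustration. The first function shows a parent blocking on its child, so that a crash is noticed without any network timeout. The class shows a heartbeat watchdog that flags a guest that is still running but has gone silent.

```python
import subprocess
import time


def wait_for_exit(cmd):
    """Run cmd as a child process and block until it exits.

    The parent is woken by the kernel as soon as the child dies,
    so detection does not depend on any network timeout.
    """
    child = subprocess.Popen(cmd)
    return child.wait()  # exit code of the child


class GuestWatchdog:
    """Tracks the last heartbeat per guest and reports silent guests."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s  # hypothetical deadline, e.g. 0.05 for 50 ms
        self.clock = clock          # injectable so the logic can be tested
        self.last_seen = {}

    def heartbeat(self, guest_id):
        # Called whenever the guest (or an agent inside it) pets the watchdog.
        self.last_seen[guest_id] = self.clock()

    def stuck_guests(self):
        # A guest counts as stuck once its last heartbeat is older than the timeout.
        now = self.clock()
        return sorted(g for g, seen in self.last_seen.items()
                      if now - seen > self.timeout_s)
```

The two mechanisms cover different failures: waiting on the child catches a guest process that has died, while the heartbeat check catches a guest whose process is alive but hung.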
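The split-brain concern in 2.5 comes down to an ordering rule: a replacement may only be started after isolation of the old guest has been confirmed within the deadline. A minimal sketch of that rule follows, assuming two hypothetical callables, `fence` (cuts power, network, and disk access and returns True only when that is confirmed) and `start_replacement`. Neither is an existing OpenStack API.

```python
import time


def fail_over(fence, start_replacement, deadline_s, clock=time.monotonic):
    """Start a replacement guest only if the old one was fenced in time.

    Returns True if the replacement was started, False otherwise.
    """
    started = clock()
    confirmed = fence()  # hypothetical: isolate the old guest, report success
    elapsed = clock() - started

    if not confirmed or elapsed > deadline_s:
        # The old guest may still be running. Starting a replacement now
        # could leave two active nodes, so give up instead.
        return False

    start_replacement()
    return True
```

The design choice here is to prefer no replacement over a possibly duplicated one; a real controller would presumably escalate (retry fencing, alert an operator) rather than just return False.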
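For the placement requirement in 2.7, the rule "redundant nodes must run on different hosts" can be sketched as a small anti-affinity placement function. This is only an illustration in plain Python, not the OpenStack scheduler; the function name, the (guest, group) input format, and the least-loaded tie-break are all assumptions.

```python
def place_with_anti_affinity(guests, hosts):
    """Assign guests to hosts so that no two guests of a group share a host.

    guests: list of (guest_name, group_name) pairs
    hosts:  list of host names
    Returns {guest_name: host}. Raises ValueError if a group has more
    members than there are hosts.
    """
    hosts_used_by_group = {}        # group -> hosts already holding a member
    load = {h: 0 for h in hosts}    # number of guests placed on each host
    placement = {}

    for name, group in guests:
        taken = hosts_used_by_group.setdefault(group, set())
        candidates = [h for h in hosts if h not in taken]
        if not candidates:
            raise ValueError("no free host left for group %r" % group)
        # Pick the least loaded candidate; ties go to the earlier host.
        host = min(candidates, key=lambda h: load[h])
        placement[name] = host
        taken.add(host)
        load[host] += 1

    return placement
```

For example, with hosts `["h1", "h2"]` an active/standby pair in the same group ends up on two different hosts, while a third member of that group is rejected with an error instead of being co-located.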
This was just off the top of my head; it would be interesting to
hear your thoughts about these issues. This need comes from the
telco world, which would need a telco cloud with more such real-time
features in it. Certainly the same applies to many other legacy
Mailing list: https://launchpad.net/~openstack
Post to : openstack [at] lists
Unsubscribe : https://launchpad.net/~openstack
More help : https://help.launchpad.net/ListHelp