
andrew at beekhof
Aug 5, 2012, 9:58 PM
Post #2 of 2
Re: stonith not triggered on resource failure
On Thu, Aug 2, 2012 at 2:32 AM, Cal Heldenbrand <cal [at] fbsdata> wrote:
> Hi everyone,
>
> I'm starting to get my memcached cluster setup more operational now. But
> I'm running into one small problem -- when my memcached resource check
> fails, the stonith primitive isn't triggered to reset the node. It only
> happens when it's loaded up enough to cause corosync to fail. When the
> stonith does fire, it resets the node correctly.
>
> Here's the relevant snippets of my config. fence_virsh is used just for my
> testing environment of Xen VMs.
>
> ------------------------------------------------------------------------
> node mem1
> node mem2
> node mem3
> primitive mem1-xen-host stonith:fence_virsh \
>         op monitor interval="1s" timeout="5s" \
>         params ipaddr="vmhost1" login="root" action="reboot"
> identity_file="/root/.ssh/id_dsa" port="mem1" pcmk_host_list="mem1"
> pcmk_host_check="static-list" pcmk_host_map="" verbose="true"
> debug="/var/log/vmhost1.log" \
>         meta is-managed="true"
> primitive memcached ocf:fbs:memcached \
>         meta is-managed="true" \
>         op monitor interval="1s" timeout="1s"
> clone mem1-xen-host-clone mem1-xen-host \
>         meta target-role="Started"
> clone memcached_clone memcached \
>         params ordered="false" \
>         meta target-role="Started" migration-threshold="1"
>
> # stonith device for mem1 should never run on mem1
> location st-mem1-not-on-mem1 mem1-xen-host-clone -inf: mem1
>
> # ensure ip-mem1 has a working memcache
> colocation ip-mem1-on-memcache inf: cluster-ip-mem1 memcached_clone
>
> # ensure ip-mem2 does not live on the same node as ip-mem1
> # UNLESS the other 2 nodes are down.
> colocation ip-mem2-not-on-ip-mem1 -10000: cluster-ip-mem2 cluster-ip-mem1
> ------------------------------------------------------------------------
>
> And here's what the cluster status looks like when the memcached service
> check is failing, but the node is still up.

add on-fail=fence to the memcached monitor op definition.
seems a little severe though :-)

> ------------------------------------------------------------------------
> Online: [ mem1 mem2 mem3 ]
>
>  cluster-ip-mem2 (ocf::heartbeat:IPaddr2): Started mem2
>  cluster-ip-mem1 (ocf::heartbeat:IPaddr2): Started mem3
>  Clone Set: memcached_clone [memcached]
>      Started: [ mem2 mem3 ]
>      Stopped: [ memcached:2 ]
>  Clone Set: mem1-xen-host-clone [mem1-xen-host]
>      Started: [ mem2 mem3 ]
>      Stopped: [ mem1-xen-host:2 ]
> ------------------------------------------------------------------------
>
> What configuration directive can I add that would force the stonith event to
> run when the memcached_clone is stopped?
>
> Thank you!
>
> --Cal
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker [at] oss
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
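
For reference, a minimal sketch of the on-fail=fence suggestion, applied to the memcached primitive from the quoted config. It uses the same crm shell syntax as the original; only the on-fail attribute is new. It assumes the stonith-enabled cluster property is left at true and that a working stonith device exists for each node, otherwise the fence request cannot be carried out.

```
# on-fail="fence": a failed monitor fences the node the resource
# was running on, instead of only recovering the resource
primitive memcached ocf:fbs:memcached \
        meta is-managed="true" \
        op monitor interval="1s" timeout="1s" on-fail="fence"
```

With a 1s interval and a 1s timeout, a single slow monitor reply would probably be enough to reset the node, which is likely the "severe" part. A longer timeout may be worth considering before enabling this outside a test environment.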