
andrew at beekhof
Jul 29, 2012, 7:37 PM
Post #4 of 4
Re: 3 node cluster - two nodes get fenced/rebooted when one dies?
On Fri, Jul 6, 2012 at 8:25 AM, Errol Neal <eneal [at] businessgrade> wrote:
> Hi again. I was hoping to get some insight into why two nodes get rebooted in my cluster when I halt one of them.
>
> I'm running corosync 1.1.4 and pacemaker-1.1.6 on CentOS 6.2. I've put my configuration up on pastebin if anyone would like to take a look
>
> http://pastebin.com/raw.php?i=6cAkJ3Qk

Not really enough I'm afraid.
We'd need a crm_report archive which has the logs and other data necessary to debug an issue of this kind.

>
> Could this be related?

No.

>
> ERROR: native_create_actions: Resource st-xenapi-nas1-dev3-fence (stonith::fence_xenapi) is active on 2 nodes attempting recovery
>
> I noticed that during such times, multiple nodes are running the same resource. Incidentally, even if this isn't the cause, is there a way to prevent this?

Not really, although I have been thinking about how to mask it in the PE.
Basically if there is a fencing device active on nodeX that is about to be fenced, under some conditions we start it on nodeY before stopping it on nodeX.
This is cheating a little, but is the only way to make progress if nodeY needs it to fence nodeX or another node that failed at the same time.

> Thanks in advance..
>
> -Errol
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker [at] oss
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org
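
The ordering described in the reply (start the fencing device on nodeY before it is stopped on nodeX) can be illustrated with a toy model. This is only a sketch under stated assumptions, not Pacemaker's policy engine code: the function `plan_fencing`, the action names, and the rule of picking the first surviving node are all hypothetical and exist purely to show why the device can briefly appear active on two nodes.

```python
def plan_fencing(device_location, target, survivors):
    """Return an ordered list of (action, node) steps for a toy cluster.

    device_location: node where the fencing device is currently active
    target:          node that is about to be fenced
    survivors:       healthy nodes that could run the fencing device

    Hypothetical illustration only; not how Pacemaker is implemented.
    """
    steps = []
    if device_location == target:
        if not survivors:
            raise ValueError("no surviving node can run the fencing device")
        # The device lives on the node being fenced, so it is started on a
        # survivor first. In this toy model it is briefly "active" on two
        # nodes, because no stop has happened on the target yet.
        executor = survivors[0]
        steps.append(("start_device", executor))
    # Fencing can now proceed from a node that has the device.
    steps.append(("fence", target))
    if device_location == target:
        # Only after the fence succeeds is the old instance known stopped.
        steps.append(("device_confirmed_stopped", target))
    return steps


if __name__ == "__main__":
    for step in plan_fencing("nodeX", "nodeX", ["nodeY", "nodeZ"]):
        print(step)
```

When the device is already on a healthy node, the same function returns just the fence step, so the overlap only shows up in the case the reply describes.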