
Mailing List Archive: Linux-HA: Pacemaker

pacemaker/stonith running "amok"

 

 



r.bhatia at ipax

Nov 5, 2008, 9:33 AM

pacemaker/stonith running "amok"

hi,

first off, please find the hb_report at [1].

here is what i did on my 2-node cluster (wc01, wc02):

> wc02# crm_standby -l reboot -N wc01 -v true
i verified that wc01 was in standby and (at least i think) the resources
had been migrated off wc01.

> wc01# apt-get -u dist-upgrade
upgraded apache2

> wc01# sync;sync;reboot
rebooted wc01, as i thought "-l reboot" would make wc01 rejoin after the
reboot.

wc01 came up but was still considered to be in standby mode. all of a
sudden, the cluster started rebooting wc02 continuously, until i finally
moved wc01 out of standby mode with:

> wc01# crm_standby -v off -N wc01 -l reboot

can anyone please explain what i did wrong?

cheers,
raoul

[1] http://ip52.ipax.at/~raoul/cluster/hb_standby_reboots.tar.gz
--
____________________________________________________________________
DI (FH) Raoul Bhatia M.Sc. email. r.bhatia [at] ipax
Technischer Leiter

IPAX - Aloy Bhatia Hava OEG web. http://www.ipax.at
Barawitzkagasse 10/2/2/11 email. office [at] ipax
1190 Wien tel. +43 1 3670030
FN 277995t HG Wien fax. +43 1 3670030 15
____________________________________________________________________

_______________________________________________
Pacemaker mailing list
Pacemaker [at] clusterlabs
http://list.clusterlabs.org/mailman/listinfo/pacemaker


beekhof at gmail

Nov 7, 2008, 5:33 AM

Re: pacemaker/stonith running "amok"

On Nov 5, 2008, at 6:33 PM, Raoul Bhatia [IPAX] wrote:

> can anyone please explain what i did wrong?

The logs don't go back far enough to say.

At 18:18:05 the PE (policy engine) is invoked, sees that wc02 has failed,
and starts to shoot it - but there is no record of wc02 leaving the ccm.
Then all the stonith commands fail - you might want to check the script.

But there is no record at all of wc01 rebooting, or of wc02's reaction
when wc01 returns.
