
Mailing List Archive: Linux-HA: Users

maintenance on 2 nodes in a ha linux cluster

 

 



Jonathan.Ilroy at smals

Jul 2, 2009, 7:45 AM

maintenance on 2 nodes in a ha linux cluster

Hi,

I am puzzled by some strange behavior of HA (version 1, two nodes,
simple configuration).
Here's the scenario: two nodes (servera and serverb) are running HA on
Debian Linux 5.0.2. All software is installed from the packages provided
by Debian.
HA works correctly when one node is made unavailable and brought back.
However, consider this scenario:

1) servera is shut down (say, for maintenance)
2) serverb is shut down (again for maintenance, intentionally creating a
loss of availability)
3) servera is restarted (at this point HA is not restarted on servera:
no heartbeat, no resources, etc. --> ????)
4) serverb is restarted (at this point heartbeat is started on both
nodes and the cluster works correctly again)

It is strange that after step 3) HA is not restarted and the resources
are not made available on servera, even though serverb is not yet
available. As soon as one node is up, it should start the resources. I
have also noted that starting heartbeat manually after step 3) works;
only the automatic start of heartbeat at boot does not. Of course,
heartbeat is configured to start automatically.

Is this normal behavior for HA v1?

Here's ha.cf:
debugfile /var/log/ha-debug
logfile /var/log/ha-log
keepalive 1
deadtime 10
warntime 2
initdead 30
udpport 694
ucast eth1 192.168.2.2
auto_failback on
stonith_host servera.mydomain.com external/xen serverb.mydomain.com xen-stonith@10.32.2.31
stonith_host serverb.mydomain.com external/xen servera.mydomain.com xen-stonith@10.32.2.30
node servera.mydomain.com
node serverb.mydomain.com

And here's haresources:

servera.mydomain.com IPaddr::10.32.2.29/26/eth0 drbddisk::r0 iscsitarget



_______________________________________________
Linux-HA mailing list
Linux-HA[at]lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


dejanmm at fastmail

Jul 2, 2009, 7:52 AM

Re: maintenance on 2 nodes in a ha linux cluster

Hi,

On Thu, Jul 02, 2009 at 04:45:37PM +0200, Jonathan.Ilroy[at]smals.be wrote:
> Is this a normal behavior of HA v1 ?

I'd say yes, but you should check the logs. Normally, a node in a
two-node cluster should not start resources unless it either
hears from the other node or makes sure that the other node is
down by fencing it using stonith. The question is whether your
stonith setup works.
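
As a rough illustration of the rule described above, here is a minimal
Python sketch. It is not Heartbeat's actual code: the function name, its
parameters, and the returned labels are all hypothetical and only mirror
the two conditions named in the paragraph.

```python
def startup_action(heard_from_peer: bool, peer_fenced: bool) -> str:
    """Decide what a freshly booted node in a two-node cluster may do.

    heard_from_peer: a heartbeat from the other node has been received.
    peer_fenced:     stonith has confirmed that the other node is down.
    (Both names are illustrative assumptions, not Heartbeat internals.)
    """
    if heard_from_peer:
        # Both nodes are alive, so normal resource placement applies.
        return "coordinate"
    if peer_fenced:
        # The peer is known to be down, so it is safe to take the resources.
        return "start"
    # Peer state is unknown: starting now would risk running the
    # resources on both nodes at once.
    return "wait"


# Step 3 of the scenario: servera boots alone and serverb is silent.
print(startup_action(heard_from_peer=False, peer_fenced=False))  # wait
print(startup_action(heard_from_peer=False, peer_fenced=True))   # start
```

Under this reading, servera in step 3 hears nothing from serverb, so
whether it takes the resources depends entirely on whether the stonith
operation against serverb succeeds.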

Thanks,

Dejan



Jonathan.Ilroy at smals

Jul 3, 2009, 3:08 AM

Re: maintenance on 2 nodes in a ha linux cluster

Hi,

There is nothing in the logs (/var/log/ha-{log,debug}); it looks like
heartbeat is not started at boot. I've managed to make it run at startup,
but I had to disable drbd in /etc/rc2.d/, which causes other issues: drbd
can no longer be started by heartbeat. So I think drbd, unlike the other
services started by heartbeat, should not be disabled at boot as some
posts in other forums suggest (please confirm). My xen-stonith agent is
not called when servera is restarted. Maybe drbd is preventing heartbeat
from starting or running correctly? This is what I see in ps (no
heartbeat process):

1256 ? S 0:00 \_ /bin/bash /etc/rc2.d/S70drbd start
1286 ? S 0:00 \_ /sbin/drbdadm wait-con-int
1287 ? S 0:00 \_ /sbin/drbdsetup /dev/drbd0 wait-connect --degr-wfc-timeout=12

Any ideas?
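
The ps output above shows the boot still inside S70drbd, waiting in
wait-connect. Assuming the runlevel scripts are run one after another in
name order, any script sorted after S70drbd cannot start until that wait
ends. The Python sketch below lists which start scripts run before a
given service. The helper name and the directory listing are
hypothetical; only S70drbd comes from the post, and S75heartbeat is an
assumed name for the heartbeat link.

```python
import re


def start_scripts_before(entries, service):
    """Return the S-scripts that run before `service` in an rc directory.

    entries: file names as found in a directory such as /etc/rc2.d/
    service: service name without its SNN prefix, e.g. "heartbeat"
    """
    # Keep only start links (S + two digits + name); sorting by name
    # gives the order in which a sequential SysV rc would run them.
    starts = sorted(e for e in entries if re.fullmatch(r"S\d\d.+", e))
    for i, name in enumerate(starts):
        if name[3:] == service:
            return starts[:i]
    raise ValueError(f"no start script found for {service!r}")


# Hypothetical listing; only S70drbd is taken from the post.
example = ["S75heartbeat", "S20ssh", "S70drbd", "S10rsyslog"]
print(start_scripts_before(example, "heartbeat"))
```

If S70drbd appears in the result, heartbeat's start is held back for as
long as drbd waits for its peer. That wait is governed by the
wfc-timeout and degr-wfc-timeout startup options in drbd.conf, so those
values are worth checking.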

Jonathan




 
 

