Mailing List Archive: DRBD: Users

Switchover takes quite a long time

 

 

pierre.lebrech at laposte

Aug 4, 2009, 3:47 AM

Post #1 of 2
Switchover takes quite a long time

Hello,

Context: a 3-node cluster with all nodes connected, HA services running on node1, DRBD 8.3.2 on Linux 2.6.30.

I switch the HA services over to node2 by running "/usr/lib/heartbeat/hb_standby all" on node1.

The switchover takes a long time: about 20 seconds.

The logs (ha-log) show the following.

On node1:

ResourceManager[7475]: 2009/08/04_11:53:27 info: Running /etc/ha.d/resource.d/drbdupper r0-U start
Filesystem[7799]: 2009/08/04_11:53:47 INFO: Resource is stopped
ResourceManager[7475]: 2009/08/04_11:53:47 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext3 start

And on node2:

heartbeat[4836]: 2009/08/04_11:53:26 info: Local standby process completed [all].
heartbeat[4836]: 2009/08/04_11:53:26 info: New standby state: 3
heartbeat[4836]: 2009/08/04_11:53:26 info: Managed go_standby process 12561 exited with return code 0.
heartbeat[4836]: 2009/08/04_11:53:48 WARN: 1 lost packet(s) for [node1] [7702:7704]
heartbeat[4836]: 2009/08/04_11:53:48 info: remote resource transition completed.
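For anyone chasing a similar delay, the gap can be read straight off the timestamps. Below is a minimal Python sketch that prints the time elapsed between consecutive ha-log lines. It assumes the timestamp format shown in the excerpts above (YYYY/MM/DD_HH:MM:SS); the helper names `parse_stamp` and `gaps` are made up for this illustration and are not part of heartbeat or DRBD.

```python
import re
from datetime import datetime

# Timestamp format as it appears in the ha-log lines above,
# e.g. 2009/08/04_11:53:27 (assumed to be the same on every line).
STAMP = re.compile(r"(\d{4}/\d{2}/\d{2}_\d{2}:\d{2}:\d{2})")


def parse_stamp(line):
    """Return the datetime embedded in an ha-log line, or None if absent."""
    m = STAMP.search(line)
    if m is None:
        return None
    return datetime.strptime(m.group(1), "%Y/%m/%d_%H:%M:%S")


def gaps(lines):
    """Yield (seconds_since_previous_timestamped_line, line) pairs."""
    prev = None
    for line in lines:
        ts = parse_stamp(line)
        if ts is None:
            continue  # skip lines without a timestamp
        if prev is not None:
            yield (ts - prev).total_seconds(), line
        prev = ts


# The three log lines quoted above, copied verbatim.
NODE1_LOG = [
    "ResourceManager[7475]: 2009/08/04_11:53:27 info: Running /etc/ha.d/resource.d/drbdupper r0-U start",
    "Filesystem[7799]: 2009/08/04_11:53:47 INFO: Resource is stopped",
    "ResourceManager[7475]: 2009/08/04_11:53:47 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /data ext3 start",
]

for delta, line in gaps(NODE1_LOG):
    print(f"{delta:5.0f}s  {line}")
```

On these lines the first gap is 20 seconds, between the drbdupper start and the next entry, which is the step the question below is about.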


Question: why does drbdupper take such a long time to start? Is that normal?

Thanks.

Here is the drbd.conf file:

global {
    usage-count yes;
}
common {
    syncer { rate 10M; }
    net {
        max-buffers 40000;
    }
}

resource r0 {
    protocol C;
    handlers {
        pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f";
        pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
        local-io-error "echo o > /proc/sysrq-trigger ; halt -f";
    }
    startup {
        wfc-timeout 0;
        degr-wfc-timeout 120;
    }
    disk {
        on-io-error detach;
    }
    net {
        after-sb-0pri disconnect;
        after-sb-1pri disconnect;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }
    syncer {
        rate 90M;
        al-extents 128;
        csums-alg md5;
    }

    on node1 {
        device /dev/drbd0;
        disk /dev/md2;
        address 10.0.0.1:7788;
        meta-disk /dev/md1 [0];
    }
    on node2 {
        device /dev/drbd0;
        disk /dev/md2;
        address 10.0.0.2:7788;
        meta-disk /dev/md1 [0];
    }
}

resource r0-U {
    protocol C;

    syncer {
        csums-alg md5;
        rate 5M;
    }

    stacked-on-top-of r0 {
        device /dev/drbd1;
        address 192.168.2.15:7788;
    }

    on node3 {
        device /dev/drbd1;
        disk /dev/md2;
        address 192.168.2.14:7788;
        meta-disk internal;
    }
}

_______________________________________________
drbd-user mailing list
drbd-user [at] lists
http://lists.linbit.com/mailman/listinfo/drbd-user


pierre.lebrech at laposte

Aug 5, 2009, 3:10 AM

Post #2 of 2
Re: [SOLVED] Switchover takes quite a long time

The IP address configured for node3 in drbd.conf was wrong on two of the nodes!
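A mismatch like this is easy to miss by eye, so here is a rough Python sketch of one way to catch it: pull the `address` lines out of each node's copy of drbd.conf and report any section whose address differs between copies. This is only an illustration under several assumptions. The thread does not say which two nodes were affected or what the wrong value was, so the `192.168.2.41` below is invented. The function names `addresses` and `address_mismatches` are hypothetical, and the line-based parsing only handles configs laid out one statement per line, as in the file posted above.

```python
import re

RESOURCE = re.compile(r"^resource\s+(\S+)\s*\{")
SECTION = re.compile(r"^((?:on|stacked-on-top-of)\s+\S+)\s*\{")
ADDRESS = re.compile(r"^address\s+(\S+)\s*;")


def addresses(conf_text):
    """Map (resource, section) -> address for one drbd.conf text.

    Simplified parser: assumes one statement per line and no nested
    blocks inside 'on' / 'stacked-on-top-of' sections.
    """
    found = {}
    resource = section = None
    for raw in conf_text.splitlines():
        line = raw.strip()
        m = RESOURCE.match(line)
        if m:
            resource, section = m.group(1), None
            continue
        m = SECTION.match(line)
        if m:
            section = " ".join(m.group(1).split())
            continue
        m = ADDRESS.match(line)
        if m and section is not None:
            found[(resource, section)] = m.group(1)
        elif line.startswith("}") and section is not None:
            section = None  # end of the host section
    return found


def address_mismatches(configs):
    """configs maps node name -> drbd.conf text; return differing entries."""
    per_node = {node: addresses(text) for node, text in configs.items()}
    keys = set().union(*per_node.values())
    result = {}
    for key in sorted(keys):
        seen = {node: found.get(key) for node, found in per_node.items()}
        if len(set(seen.values())) > 1:
            result[key] = seen
    return result


# r0-U section as posted above.
GOOD = """
resource r0-U {
  stacked-on-top-of r0 {
    device /dev/drbd1;
    address 192.168.2.15:7788;
  }
  on node3 {
    device /dev/drbd1;
    address 192.168.2.14:7788;
  }
}
"""
# Hypothetical typo, invented for the example (not from the thread).
BAD = GOOD.replace("192.168.2.14", "192.168.2.41")

mismatches = address_mismatches({"node1": BAD, "node2": BAD, "node3": GOOD})
for (resource, section), seen in mismatches.items():
    print(resource, section, seen)
```

In practice the three texts would be read from each node's own copy of the file; how they are collected (scp, configuration management, and so on) is left open here.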
