
Mailing List Archive: DRBD: Users

DRBD Split-Brain auto recover

 

 



trm.nagios at gmail

Nov 26, 2011, 1:49 PM

Post #1 of 6 (1361 views)
DRBD Split-Brain auto recover

Dear List,

I have an HA NFS setup with DRBD. The primary is server NFS1 and the
secondary is server NFS2.

Please help me configure automatic recovery from split-brain.

Below are my config and package details.


Packages:
kmod-drbd83-8.3.8-1.el5.centos
drbd83-8.3.8-1.el5.centos

/etc/drbd.conf [same on both boxes]

common { syncer { rate 100M; al-extents 257; } }
resource main {
protocol C;
handlers { pri-on-incon-degr "halt -f"; }
disk { on-io-error detach; }
startup { degr-wfc-timeout 60; wfc-timeout 60; }

on NFS1 {
address 10.20.137.8:7789;
device /dev/drbd0;
disk /dev/sdc;
meta-disk internal;
}
on NFS2 {
address 10.20.137.9:7789;
device /dev/drbd0;
disk /dev/sdc;
meta-disk internal;
}
}
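
For reference, automatic split-brain recovery in DRBD 8.3 is configured through the after-sb-* policies in the resource's net section, which the config above does not have yet. The following is a minimal sketch only; the policy values are illustrative, and any discard-* policy means DRBD will throw away the changes made on one node:

net {
    # Neither node was Primary at split-brain time: keep the node that has changes
    after-sb-0pri discard-zero-changes;
    # Exactly one node was Primary: discard the Secondary's changes
    after-sb-1pri discard-secondary;
    # Both nodes were Primary: give up and stay disconnected
    after-sb-2pri disconnect;
}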


symack at gmail

Nov 27, 2011, 8:16 AM

Post #2 of 6 (1325 views)
Re: DRBD Split-Brain auto recover [In reply to]

I could be wrong, but topics as important as a disk replicator's
ability to automatically recover from split brain have been covered
multiple times on its list, not to mention the thorough documentation.

http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html
http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html#s-automatic-split-brain-recovery-configuration
http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html#s-split-brain-notification

How about it......

Nick from Toronto.
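
The third link above covers split-brain notification. A small sketch of that handler, using the notify-split-brain.sh helper shipped with DRBD (the "root" mail recipient is just an example):

handlers {
    # Mail the given recipient when DRBD detects a split brain
    split-brain "/usr/lib/drbd/notify-split-brain.sh root";
}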



On Sat, Nov 26, 2011 at 4:49 PM, trm asn <trm.nagios [at] gmail> wrote:
> Dear List,
>
> I have one HA NFS setup with DRBD. Primary is NFS1 server & secondary is
> NFS2 server.
>
> Please help me out to configure the auto recovery from split-brain.
>
> Below is my config & package details.
>
>
> Packages:
> kmod-drbd83-8.3.8-1.el5.centos
> drbd83-8.3.8-1.el5.centos
>
> /etc/drbd.conf [ same one both the box]
>
> common { syncer { rate 100M; al-extents 257; } }
> resource main {
> protocol C;
> handlers { pri-on-incon-degr "halt -f"; }
> disk { on-io-error detach; }
> startup { degr-wfc-timeout 60; wfc-timeout 60; }
>
> on NFS1 {
> address 10.20.137.8:7789;
> device /dev/drbd0;
> disk /dev/sdc;
> meta-disk internal;
> }
> on NFS2 {
> address 10.20.137.9:7789;
> device /dev/drbd0;
> disk /dev/sdc;
> meta-disk internal;
> }
> }
>
>
>
>


linux at alteeve

Nov 27, 2011, 9:08 AM

Post #3 of 6 (1315 views)
Re: DRBD Split-Brain auto recover [In reply to]

On 11/26/2011 04:49 PM, trm asn wrote:
> Dear List,
>
> I have one HA NFS setup with DRBD. Primary is NFS1 server & secondary is
> NFS2 server.
>
> Please help me out to configure the auto recovery from split-brain.

You can't safely recover from split-brain automatically. Consider:

Node A saved a Fedora ISO, ~3.5GB written.
Node B saved an hour's worth of credit card transactions, ~1MB written.

Which node has the more valuable data?

The best you can do is configure and test fencing so that you can avoid
split brain conditions in the first place.

--
Digimer
E-Mail: digimer [at] alteeve
Freenode handle: digimer
Papers and Projects: http://alteeve.com
Node Assassin: http://nodeassassin.org
"omg my singularity battery is dead again.
stupid hawking radiation." - epitron
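
As a rough illustration of the fencing approach described above, DRBD 8.3 can be told to outdate or fence the peer before a node is promoted over possibly stale data. A sketch only, assuming a Pacemaker/CRM-managed cluster (with plain Heartbeat, the dopd-based drbd-peer-outdater handler would be the usual choice instead); these options would be merged into the existing disk and handlers sections:

disk {
    fencing resource-only;
}
handlers {
    # Add a Pacemaker constraint so the disconnected peer cannot be promoted
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    # Remove that constraint again once the peer has resynced
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
}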


trm.nagios at gmail

Dec 7, 2011, 11:14 PM

Post #4 of 6 (1251 views)
Re: DRBD Split-Brain auto recover [In reply to]

On Sun, Nov 27, 2011 at 9:46 PM, Nick Khamis <symack [at] gmail> wrote:

> I could be wrong, but topics as important as a disk replicator's
> ability to automatically recover
> from split brain has been covered multiple times on it's list. Not to
> mention the thourough
> documentation.
>
> http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html
>
> http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html#s-automatic-split-brain-recovery-configuration
>
> http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html#s-split-brain-notification
>
> How about it......
>
> Nick from Toronto.
> - Show quoted text -
>
>
>
> On Sat, Nov 26, 2011 at 4:49 PM, trm asn <trm.nagios [at] gmail> wrote:
> > Dear List,
> >
> > I have one HA NFS setup with DRBD. Primary is NFS1 server & secondary is
> > NFS2 server.
> >
> > Please help me out to configure the auto recovery from split-brain.
> >
> > Below is my config & package details.
> >
> >
> > Packages:
> > kmod-drbd83-8.3.8-1.el5.centos
> > drbd83-8.3.8-1.el5.centos
> >
> > /etc/drbd.conf [ same one both the box]
> >
> > common { syncer { rate 100M; al-extents 257; } }
> > resource main {
> > protocol C;
> > handlers { pri-on-incon-degr "halt -f"; }
> > disk { on-io-error detach; }
> > startup { degr-wfc-timeout 60; wfc-timeout 60; }
> >
> > on NFS1 {
> > address 10.20.137.8:7789;
> > device /dev/drbd0;
> > disk /dev/sdc;
> > meta-disk internal;
> > }
> > on NFS2 {
> > address 10.20.137.9:7789;
> > device /dev/drbd0;
> > disk /dev/sdc;
> > meta-disk internal;
> > }
> > }
> >
> >
>
>

Below I am getting packet loss warning messages, and because of that both
servers end up in StandAlone status. Is there any mechanism in DRBD to
tolerate a higher number of dropped packets?



Dec 7 19:23:13 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for [nfs2] [1782:1784]
Dec 7 19:27:21 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for [nfs2] [1906:1908]
Dec 7 19:28:27 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for [nfs2] [1939:1941]
Dec 7 19:38:49 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for [nfs2] [2250:2252]
Dec 7 19:40:01 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for [nfs2] [2286:2288]
Dec 7 19:41:31 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for [nfs2] [2331:2333]
Dec 7 19:46:01 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for [nfs2] [2466:2468]
Dec 7 19:46:47 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for [nfs2] [2489:2491]
Dec 7 19:46:59 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for [nfs2] [2495:2497]
Dec 7 19:47:09 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for [nfs2] [2500:2502]
Dec 8 06:52:48 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for [nfs2] [90:92]
Dec 8 06:52:54 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for [nfs2] [93:95]
Dec 8 06:59:14 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for [nfs2] [283:285]


Thanks & Regards,
Tarak Ranjan
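
For completeness: once both nodes sit in StandAlone after a split brain, DRBD 8.3 can be reconnected manually by choosing a split-brain "victim" whose changes are discarded. A sketch only, using the resource name "main" from the config earlier in the thread:

# On the victim (the node whose changes you are willing to lose):
drbdadm secondary main
drbdadm -- --discard-my-data connect main

# On the surviving node, if it is also StandAlone:
drbdadm connect main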


andreas at hastexo

Dec 9, 2011, 5:01 AM

Post #5 of 6 (1274 views)
Re: DRBD Split-Brain auto recover [In reply to]

On 12/08/2011 08:14 AM, trm asn wrote:
>
>
> On Sun, Nov 27, 2011 at 9:46 PM, Nick Khamis <symack [at] gmail> wrote:
>
> I could be wrong, but topics as important as a disk replicator's
> ability to automatically recover
> from split brain has been covered multiple times on it's list. Not to
> mention the thourough
> documentation.
>
> http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html
> http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html#s-automatic-split-brain-recovery-configuration
> http://www.drbd.org/users-guide/s-configure-split-brain-behavior.html#s-split-brain-notification
>
> How about it......
>
> Nick from Toronto.
> - Show quoted text -
>
>
>
> On Sat, Nov 26, 2011 at 4:49 PM, trm asn <trm.nagios [at] gmail> wrote:
> > Dear List,
> >
> > I have one HA NFS setup with DRBD. Primary is NFS1 server &
> secondary is
> > NFS2 server.
> >
> > Please help me out to configure the auto recovery from split-brain.
> >
> > Below is my config & package details.
> >
> >
> > Packages:
> > kmod-drbd83-8.3.8-1.el5.centos
> > drbd83-8.3.8-1.el5.centos
> >
> > /etc/drbd.conf [ same one both the box]
> >
> > common { syncer { rate 100M; al-extents 257; } }
> > resource main {
> > protocol C;
> > handlers { pri-on-incon-degr "halt -f"; }
> > disk { on-io-error detach; }
> > startup { degr-wfc-timeout 60; wfc-timeout 60; }
> >
> > on NFS1 {
> > address 10.20.137.8:7789;
> > device /dev/drbd0;
> > disk /dev/sdc;
> > meta-disk internal;
> > }
> > on NFS2 {
> > address 10.20.137.9:7789;
> > device /dev/drbd0;
> > disk /dev/sdc;
> > meta-disk internal;
> > }
> > }
> >
> >
>
>
>
> Below I am getting one packet loss warning message. And due to that it's
> becoming StandAlone status on both the servers. Is there any mechanism
> to increase the number of packet drop count in DRBD .

That has nothing to do with DRBD; these are messages from Heartbeat's
messaging layer ... flaky network?

Regards,
Andreas

--
Need help with DRBD & Pacemaker?
http://www.hastexo.com/now
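
One way to check whether the DRBD replication link itself is dropping, rather than only Heartbeat's messaging, is to watch the connection state on both nodes. A small sketch, again assuming the resource name "main":

# Connection state: Connected, WFConnection, StandAlone, ...
drbdadm cstate main
cat /proc/drbd

# DRBD logs its own connection losses via the kernel log
grep -i drbd /var/log/messages | tail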

>
>
>
> Dec 7 19:23:13 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for
> [nfs2] [1782:1784]
> Dec 7 19:27:21 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for
> [nfs2] [1906:1908]
> Dec 7 19:28:27 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for
> [nfs2] [1939:1941]
> Dec 7 19:38:49 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for
> [nfs2] [2250:2252]
> Dec 7 19:40:01 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for
> [nfs2] [2286:2288]
> Dec 7 19:41:31 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for
> [nfs2] [2331:2333]
> Dec 7 19:46:01 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for
> [nfs2] [2466:2468]
> Dec 7 19:46:47 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for
> [nfs2] [2489:2491]
> Dec 7 19:46:59 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for
> [nfs2] [2495:2497]
> Dec 7 19:47:09 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for
> [nfs2] [2500:2502]
> Dec 8 06:52:48 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for
> [nfs2] [90:92]
> Dec 8 06:52:54 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for
> [nfs2] [93:95]
> Dec 8 06:59:14 NFS1 heartbeat: [12280]: WARN: 1 lost packet(s) for
> [nfs2] [283:285]
>
>
> Thanks & Regards,
> Tarak Ranjan
>
>
>


lars.ellenberg at linbit

Dec 9, 2011, 7:02 AM

Post #6 of 6 (1237 views)
Re: DRBD Split-Brain auto recover [In reply to]

On Fri, Dec 09, 2011 at 02:01:33PM +0100, Andreas Kurz wrote:
> On 12/08/2011 08:14 AM, trm asn wrote:

> > Below I am getting one packet loss warning message. And due to that it's
> > becoming StandAlone status on both the servers. Is there any mechanism
> > to increase the number of packet drop count in DRBD .
>
> That has nothing to do with DRBD, these are messages from Heartbeats
> messaging layer ... flaky network?

Slight reordering of messages can always happen.
Heartbeat tries to be nice and not warn about it if it receives the
"missing" messages within a (configurable, short) timeout.

Versions from some point [I don't know exactly when; it also depends
somewhat on platform, compile-time and run-time environment] until 3.0.5
would unfortunately set this timeout to zero, and so would complain about
each reordering as if it really were a lost packet.

I recommend upgrading to the latest Heartbeat.

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
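
A quick way to confirm which Heartbeat version each node is actually running before upgrading (CentOS 5 ships it as an RPM):

# Show the installed Heartbeat package version on this node
rpm -q heartbeat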
