Mailing List Archive: DRBD: Users

DRBD + OCFS2 - Split-Brain detected but unresolved

 

 



cjosh at silvercube

Apr 17, 2012, 8:06 AM

Post #1 of 6 (843 views)
DRBD + OCFS2 - Split-Brain detected but unresolved

Hello,

I am currently testing dual-master setup with DRBD+OCFS2.
Finally I managed to get it working well on kernel 2.6.39.4, DRBD version
8.3.10 (userland version: 8.4.1) and OCFS2 version 1.5.0.

I had some trouble with broken replication, and I see that automatic
recovery sometimes works and sometimes does not. What's strange is that
these are still tests: while one server is fully functional, the second
one has no processes that even touch the synchronized partition.

In dmesg on the active server it looks like this:

[707152.209885] block drbd0: Handshake successful: Agreed network protocol version 96
[707152.209895] block drbd0: conn( WFConnection -> WFReportParams )
[707152.210068] block drbd0: Starting asender thread (from drbd0_receiver [1096])
[707152.210341] block drbd0: data-integrity-alg: <not-used>
[707152.210352] block drbd0: max BIO size = 130560
[707152.210359] block drbd0: drbd_sync_handshake:
[707152.210363] block drbd0: self 8631CEC3370B5C31:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5 bits:21 flags:0
[707152.210368] block drbd0: peer 73E08E8E06754D97:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5 bits:0 flags:0
[707152.210371] block drbd0: uuid_compare()=100 by rule 90
[707152.210377] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
[707152.212439] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
[707152.212442] block drbd0: Split-Brain detected but unresolved, dropping connection!
[707152.212445] block drbd0: helper command: /sbin/drbdadm split-brain minor-0
[707152.214134] block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
[707152.214137] block drbd0: conn( WFReportParams -> Disconnecting )
[707152.214141] block drbd0: error receiving ReportState, l: 4!
[707152.214150] block drbd0: asender terminated
[707152.214154] block drbd0: Terminating drbd0_asender
[707152.214177] block drbd0: Connection closed
[707152.214180] block drbd0: conn( Disconnecting -> StandAlone )
[707152.214188] block drbd0: receiver terminated
[707152.214190] block drbd0: Terminating drbd0_receiver

Is there any help for this situation? I don't understand why the case
isn't resolved, since the second server doesn't write to drbd0; sometimes
the partition wasn't even mounted (I can't be 100% sure, but it seems so).

I would be grateful if you could give me a hint on how to make this
configuration stable without sacrificing data on one of the nodes (right
now, in order to recover, I have to set the second node to slave). Any
ideas what is wrong in my setup?

P.S. Any suggestions how to measure real performance (read/write/copy) of
DRBD+OCFS2? UnixBench gives crazy results (read performance about 10% of
local filesystem)...

Best regards,
--
Jacek Osiecki
josiecki [at] silvercube

Silvercube s.c.
ul. Makuszynskiego 4
31-752 Kraków
+48 (12) 684 21 00


david at davidcoulson

Apr 17, 2012, 9:08 AM

Post #2 of 6 (808 views)
Re: DRBD + OCFS2 - Split-Brain detected but unresolved [In reply to]

http://www.drbd.org/users-guide/s-resolve-split-brain.html




_______________________________________________
drbd-user mailing list
drbd-user [at] lists
http://lists.linbit.com/mailman/listinfo/drbd-user


jsmith at argotec

Apr 17, 2012, 9:22 AM

Post #3 of 6 (817 views)
Re: DRBD + OCFS2 - Split-Brain detected but unresolved [In reply to]

> On Apr 17, 2012, at 11:06 AM, Jacek Osiecki wrote:
> >
> > I am currently testing dual-master setup with DRBD+OCFS2.
> > Finally I managed to get it working well on kernel 2.6.39.4, DRBD
> > version 8.3.10 (userland version: 8.4.1) and OCFS2 version 1.5.0.
> >

I could be wrong, but I believe it is bad practice to run a DRBD userland version that differs from the kernel module version.


> > P.S. Any suggestions how to measure real performance
> > (read/write/copy) of DRBD+OCFS2? UnixBench gives crazy results
> > (read performance about 10% of local filesystem)...

For DRBD here's a start:
http://www.drbd.org/users-guide-legacy/p-performance.html



ff at mpexnet

Apr 17, 2012, 9:31 AM

Post #4 of 6 (804 views)
Re: DRBD + OCFS2 - Split-Brain detected but unresolved [In reply to]

Hi,

On 04/17/2012 05:06 PM, Jacek Osiecki wrote:
> automatic recovery sometimes works and sometimes does
> not.

we seem to be lacking your drbd config. How is automatic split brain
recovery configured?
I get the feeling it's not. What split-brain situations have you
perceived as being automatically solved?

> 73E08E8E06754D97:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5
> bits:0 flags:0

This looks fine - the peer has set 0 bits, so it's probably indeed
unchanged.
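
The check described above (reading the bits: field of the self and peer lines) can be scripted. The following is only a sketch: the regular expression is an assumption based on the two handshake lines quoted in this thread, it expects each log entry on a single unwrapped line, and the function names are made up for illustration.

```python
import re

# Pattern for the "self"/"peer" lines that follow drbd_sync_handshake.
# Assumption: the format matches the dmesg excerpt quoted in this thread.
HANDSHAKE_RE = re.compile(
    r"block (?P<dev>drbd\d+): (?P<side>self|peer) "
    r"(?P<uuids>[0-9A-F]{16}(?::[0-9A-F]{16}){3}) "
    r"bits:(?P<bits>\d+) flags:(?P<flags>\d+)"
)


def parse_handshake(log_text):
    """Return {'self': {...}, 'peer': {...}} for the handshake lines found."""
    result = {}
    for match in HANDSHAKE_RE.finditer(log_text):
        result[match.group("side")] = {
            "uuids": match.group("uuids").split(":"),
            "bits": int(match.group("bits")),
            "flags": int(match.group("flags")),
        }
    return result


def unchanged_sides(handshake):
    """Sides whose bitmap reports zero bits set."""
    return sorted(side for side, info in handshake.items() if info["bits"] == 0)


SAMPLE = """\
[707152.210363] block drbd0: self 8631CEC3370B5C31:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5 bits:21 flags:0
[707152.210368] block drbd0: peer 73E08E8E06754D97:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5 bits:0 flags:0
"""

if __name__ == "__main__":
    print(unchanged_sides(parse_handshake(SAMPLE)))
```

With the excerpt from the first post, only the peer is reported as having zero bits set, which matches the reading above.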

> why the case isn't solved, since second server doesn't write to drbd0,
> sometimes even partition wasn't mounted (I can't be 100% sure, but it
> seems so).

A policy of discard-zero-changes could solve this for you, but only if
configured thus.

> I would be greatful if you could give me some hint how to make this
> configuration stable, without sacrificing data on one of nodes (now in
> order to recover I have to set second node to slave). Any ideas what is
> wrong in my setup?

Your config would be ultra helpful :-)

> P.S. Any suggestions how to measure real performance (read/write/copy)
> of DRBD+OCFS2? UnixBench gives crazy results (read performance about 10%
> of local filesystem)...

Is this crazy? I wouldn't know. But bear in mind that stat can be an
expensive operation on a cluster file system vs. a regular old fs.
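
To get a feeling for how expensive stat is on a given mount, one could time it directly. This is a rough sketch, not a proper benchmark: the function name is made up for illustration, and results depend heavily on caching, so compare repeated runs on the local filesystem and on the OCFS2 mount rather than trusting a single number.

```python
import os
import sys
import time


def mean_stat_seconds(directory, repeat=3):
    """Average wall-clock time of one os.stat() call over the entries in directory."""
    paths = [os.path.join(directory, name) for name in os.listdir(directory)]
    if not paths:
        raise ValueError("directory is empty")
    start = time.perf_counter()
    for _ in range(repeat):
        for path in paths:
            os.stat(path)
    elapsed = time.perf_counter() - start
    return elapsed / (repeat * len(paths))


if __name__ == "__main__":
    # Usage: python stat_timing.py /path/on/ext3 /path/on/ocfs2
    for directory in sys.argv[1:]:
        print(f"{directory}: {mean_stat_seconds(directory) * 1e6:.1f} us per stat")
```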

HTH,
Felix


cjosh at silvercube

Apr 17, 2012, 12:11 PM

Post #5 of 6 (849 views)
Re: DRBD + OCFS2 - Split-Brain detected but unresolved [In reply to]

On Tue, 17 Apr 2012 18:31:09 +0200, Felix Frank wrote:

> On 04/17/2012 05:06 PM, Jacek Osiecki wrote:
>> automatic recovery sometimes works and sometimes does
>> not.

> we seem to be lacking your drbd config.

Right, my bad :)

> How is automatic split brain recovery configured?

Probably it isn't - here's the config:

global { usage-count yes; }

common {
    handlers {
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";
    }
    disk { on-io-error detach; }
    syncer { rate 100M; }
}

and the resource config:

resource home
{
    protocol C;
    meta-disk internal;
    device /dev/drbd0;
    disk /dev/md4;
    net {
        allow-two-primaries;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    startup { become-primary-on both; }
    on mike { address 176.xx.xx.xx:7789; }
    on november { address 176.yy.yy.yy:7789; }
}

> I get the feeling it's not. What split-brain situations have you
> perceived as being automatically solved?

Something like this:

[287856.619503] block drbd0: Handshake successful: Agreed network protocol version 96
[287856.619512] block drbd0: conn( WFConnection -> WFReportParams )
[287856.619682] block drbd0: Starting asender thread (from drbd0_receiver [24712])
[287856.619885] block drbd0: data-integrity-alg: <not-used>
[287856.619967] block drbd0: max BIO size = 130560
[287856.619978] block drbd0: drbd_sync_handshake:
[287856.619982] block drbd0: self 18D97D7348BC1031:232CE4A32F2915DB:B873B3F48F57A893:B872B3F48F57A893 bits:50 flags:0
[287856.619987] block drbd0: peer 8359D2DF4D7761E0:232CE4A32F2915DB:B873B3F48F57A893:B872B3F48F57A893 bits:3072 flags:2
[287856.619992] block drbd0: uuid_compare()=100 by rule 90
[287856.619995] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
[287856.622133] block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
[287856.622136] block drbd0: Split-Brain detected, 1 primaries, automatically solved. Sync from this node
[287856.622141] block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
[287856.639285] block drbd0: peer( Secondary -> Primary )
[287856.986857] block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0
[287856.988873] block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0 exit code 0 (0x0)
[287856.988879] block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent )
[287856.988884] block drbd0: Began resync as SyncSource (will sync 12484 KB [3121 bits set]).
[287856.988895] block drbd0: updated sync UUID 18D97D7348BC1031:232DE4A32F2915DB:232CE4A32F2915DB:B873B3F48F57A893
[287857.202264] block drbd0: Resync done (total 1 sec; paused 0 sec; 12484 K/sec)
[287857.202268] block drbd0: updated UUIDs 18D97D7348BC1031:0000000000000000:232DE4A32F2915DB:232CE4A32F2915DB
[287857.202272] block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
[287857.347396] block drbd0: bitmap WRITE of 4793 pages took 29 jiffies
[287857.419057] block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.

but now I see that those were probably split-brains after the secondary
node was rebooted, while I was repeatedly testing automatic set-up of drbd
after reboot. Am I right?


>> 73E08E8E06754D97:A9BC3587FC5AA879:0BF8587A4ABA37B5:0BF7587A4ABA37B5
>> bits:0 flags:0
> This looks fine - the peer has set 0 bits, so it's probably indeed
> unchanged.
>> why the case isn't solved, since second server doesn't write to drbd0,
>> sometimes even partition wasn't mounted (I can't be 100% sure, but it
>> seems so).
> A policy of discard-zero-changes could solve this for you, but only if
> configured thus.

Seems that my config is lacking this.

My plan is to use DRBD+OCFS2 for an HA configuration, with two machines
behind a hardware load balancer. So far I've been modifying the filesystem
on one machine only. I'm wondering how to handle the situation where the
nodes can't see each other but are both still reachable from the internet
(that's possible for distant locations). Are there any mechanisms capable
of synchronizing the nodes at the filesystem level once node-to-node
communication is up again? I mean that sometimes, even though both
filesystems have been modified, the changes don't cause any conflicts...

Is anyone using such a configuration? What policies are you using?

>> P.S. Any suggestions how to measure real performance (read/write/copy)
>> of DRBD+OCFS2? UnixBench gives crazy results (read performance about 10%
>> of local filesystem)...
> Is this crazy? I wouldn't know. But bear in mind that stat can be an
> expensive operation on a cluster file system vs. a regular old fs.

Here are the results from UnixBench, where I compared:
- local ext3 filesystem
- drbd+ocfs2 in master-master cluster :)
- NFS from NAS provided by OVH hosting

Results are in KBps, for copy/read/write. I didn't even dig into the exact
meaning of the UnixBench parameters or its methodology; I just wanted to
compare raw values in similar circumstances:

+------------------------+--------------+------------------+-------------------+
| X bufsize, Y maxblocks | ext3 (local) | drbd+ocfs2       | NFS (ovh-nas)     |
+------------------------+--------------+------------------+-------------------+
| CP 1024 buf 2000 mxbl  |    1001513.5 |   329691.5 (33%) |     8439.9 (0.8%) |
| CP  256 buf  500 mxbl  |     289354.4 |    83344.5 (29%) |     7545.5 (2.6%) |
| RD 1024 buf 2000 mxbl  |   16683047.3 |  1627301.6 (10%) | 16026036.4 ( 96%) |
| RD  256 buf  500 mxbl  |    4737836.5 |   413126.7 ( 9%) |  4509106.6 ( 95%) |
| RD 4096 buf 8000 mxbl  |   35705631.9 |  6872806.6 (19%) | 34967996.7 ( 97%) |
| WR  256 buf  500 mxbl  |     315172.2 |    87545.4 (28%) |     8711.3 (2.8%) |
| WR 4096 buf 8000 mxbl  |    3522086.9 |  1290255.5 (37%) |    10991.6 (0.3%) |
+------------------------+--------------+------------------+-------------------+
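
For anyone who wants to recheck the percentages, here is a small sketch that recomputes them from the raw KBps figures in the table. The dictionary layout and function name are just for illustration; the numbers are copied from the table as posted.

```python
# Throughput in KBps, copied from the table: (ext3 local, drbd+ocfs2, NFS).
RESULTS = {
    "CP 1024 buf 2000 mxbl": (1001513.5, 329691.5, 8439.9),
    "CP 256 buf 500 mxbl": (289354.4, 83344.5, 7545.5),
    "RD 1024 buf 2000 mxbl": (16683047.3, 1627301.6, 16026036.4),
    "RD 256 buf 500 mxbl": (4737836.5, 413126.7, 4509106.6),
    "RD 4096 buf 8000 mxbl": (35705631.9, 6872806.6, 34967996.7),
    "WR 256 buf 500 mxbl": (315172.2, 87545.4, 8711.3),
    "WR 4096 buf 8000 mxbl": (3522086.9, 1290255.5, 10991.6),
}


def relative_to_local(results):
    """Return {test: (drbd_pct, nfs_pct)}, each relative to the local ext3 figure."""
    return {
        name: (100.0 * drbd / local, 100.0 * nfs / local)
        for name, (local, drbd, nfs) in results.items()
    }


if __name__ == "__main__":
    for name, (drbd_pct, nfs_pct) in relative_to_local(RESULTS).items():
        print(f"{name:<22} drbd+ocfs2 {drbd_pct:5.1f}%   NFS {nfs_pct:5.1f}%")
```

Note that the read rows work out to roughly 4 to 35 GB/s for both local ext3 and NFS, which looks more like reads served from the page cache than from disk or network, so the read comparison may say little about DRBD itself.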

I wrote "crazy" since 10% seems to be quite a low value, especially
compared to copy/write, which seem to run at about 33% of local fs speed.
Now I realize that read speed is still much higher than write/copy speed.
However, could someone verify those values? I just realized that the
UnixBench results are hard to believe and seem to be much too high :)

Greetings,
--
Jacek Osiecki


ff at mpexnet

Apr 18, 2012, 1:08 AM

Post #6 of 6 (816 views)
Re: DRBD + OCFS2 - Split-Brain detected but unresolved [In reply to]

Hi,

On 04/17/2012 09:11 PM, Jacek Osiecki wrote:
> after-sb-0pri discard-zero-changes;
> after-sb-1pri discard-secondary;
> after-sb-2pri disconnect;

alright.

> [287856.622136] block drbd0: Split-Brain
> detected, 1 primaries, automatically solved. Sync from this node

Acknowledged - your automatic resolution works fine.

>> A policy of discard-zero-changes could solve this for you, but only if
>> configured thus.
>
> Seems that my config is lacking this.

It's not, your config is quite fine. (Although Linbit has suggested
"consensus" for the 1pri case in the past, which certainly appears safer
- if for any reason your one primary is the one that has old data, you
lose everything since then with discard-secondary).

Note that there is no safe sb-2pri setting to solve the split-brain
automatically. You really do not *want* DRBD to try and fix itself in
this situation.

In a dual-primary setup, a sufficiently bad network hiccup will cause
this split brain situation. That's why you want stonith to make sure you
don't end up with actually diverged datasets.

Long story short - if you have two primaries that disagree about
history, you want to handle this manually. As far as I know, the number
of bits that are set in the quicksync bitmap is a helpful clue that
tells you which node has had more changes written.
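
As a rough illustration of using the bit counts as a clue: the sketch below assumes each bitmap bit covers 4 KiB, which agrees with the resync line quoted earlier in the thread ("will sync 12484 KB [3121 bits set]"). The function names are made up, and the result is only a hint for a manual decision, not a resolution policy.

```python
BITMAP_BLOCK_KIB = 4  # assumption: one bitmap bit covers 4 KiB of storage


def out_of_sync_kib(bits):
    """Approximate amount of changed data, in KiB, for a given bit count."""
    return bits * BITMAP_BLOCK_KIB


def node_with_more_changes(bits_by_node):
    """Name of the node with the most bits set, or None if empty or tied."""
    ranked = sorted(bits_by_node.items(), key=lambda item: item[1], reverse=True)
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


if __name__ == "__main__":
    # Bit counts from the handshake lines in the first post.
    bits = {"self": 21, "peer": 0}
    print(node_with_more_changes(bits), out_of_sync_kib(bits["self"]), "KiB")
```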

> Are there any mechanisms that would be capable of
> synchronizing the nodes (when node-node communication is up again) on
> filesystem level?

Tools like unison or rsync spring to mind. Still, even if you find a
sane way of doing a merge with those, there will be a lot of block-level
syncing to do *after* the merge.

Cheers,
Felix