
Mailing List Archive: DRBD: Users

Disk Corruption = DRBD Failure?

 

 



charles at fixflyer

Oct 11, 2011, 8:09 AM

Post #1 of 9
Disk Corruption = DRBD Failure?

Hi,

I have been reading the docs and am still unclear about a few things.

Assume I have a two node setup with DRBD in Primary/Primary with Xen
writing to /dev/drbd0 on node1. I use Primary/Primary for live migration
and in my Xen DomU configuration file I use phy: and not drbd: handler.

Now, what happens if the disk on node1 begins to fail and the blocks
where /dev/drbd0 resides are corrupted while we continue to write to
it? Will these bad/corrupted blocks be replicated to node2?

Example aside, in short, I am wondering whether a failing disk on a node
will result in DRBD replicating bad block data to the secondary node. I
know there is a place in the docs describing an integrity checker that
uses the kernel's crypto algorithms (like md5), so maybe that's an
option to prevent it?

In either case, is there any way to prevent bad block data from node1
being replicated to node2?

--
Regards,
Chuck Kozler
/Lead Infrastructure & Systems Administrator/
---
*Office*: 1-646-290-6267 | *Mobile*: 1-646-385-3684
FIX Flyer

Notice to Recipient: This e-mail is meant only for the intended
recipient(s) of the transmission, and contains confidential information
which is proprietary
to FIX Flyer LLC. Any unauthorized use, copying, distribution, or
dissemination is strictly prohibited. All rights to this information is
reserved by FIX Flyer LLC.
If you are not the intended recipient, please contact the sender by
reply e-mail and please delete this e-mail from your system and destroy
any copies


ff at mpexnet

Oct 11, 2011, 11:57 PM

Post #2 of 9
Re: Disk Corruption = DRBD Failure?

Hi,

On 10/11/2011 05:09 PM, Charles Kozler wrote:
> Now, what happens if the disk on node1 begins to fail and the blocks
> where /dev/drbd0 resides are corrupted while we continue to write to
> this- will these bad/corrupted blocks be replicated to node2?

I don't think that's possible.

If your disk gets corrupted, it may write bad data, true. However, DRBD
doesn't re-read the written data from disk before replicating. The DRBD
driver replicates whatever your local kernel *wants* your disk to write.

Corruption of your replica can happen if
1. your link is faulty and the transmitted data is altered or
2. the drive in your Secondary becomes corrupted

There may be other scenarios.

HTH,
Felix
_______________________________________________
drbd-user mailing list
drbd-user [at] lists
http://lists.linbit.com/mailman/listinfo/drbd-user


florian at hastexo

Oct 12, 2011, 12:04 AM

Post #3 of 9
Re: Disk Corruption = DRBD Failure?

On 2011-10-11 17:09, Charles Kozler wrote:
> Hi,
>
> I have been reading the docs and still seem to be unclear as to some things-
>
> Assume I have a two node setup with DRBD in Primary/Primary with Xen
> writing to /dev/drbd0 on node1. I use Primary/Primary for live migration
> and in my Xen DomU configuration file I use phy: and not drbd: handler.
>
> Now, what happens if the disk on node1 begins to fail and the blocks
> where /dev/drbd0 resides are corrupted while we continue to write to
> this- will these bad/corrupted blocks be replicated to node2?

If the underlying _disk_ fails in weird ways and that is why you get
corruption, then the corruption occurs below the DRBD level and there's
no corruption for DRBD to replicate.

If however you have one of your Xen domUs writing garbage to that device
(so the corruption occurs in a layer above DRBD), then of course DRBD
will happily replicate that corruption.
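
The distinction between corruption below and above DRBD can be illustrated with a toy model. This is only a sketch in Python, not DRBD code: the `Disk` and `ReplicatedDevice` classes are hypothetical, and the only assumption taken from the thread is that the replicated data comes from the buffer the upper layer submitted, not from a read-back of the local disk.

```python
# Toy model of the write path described above (not DRBD code).
# All class and attribute names are hypothetical.

class Disk:
    """A backing device that may damage data while storing it."""

    def __init__(self, faulty=False):
        self.blocks = {}
        self.faulty = faulty

    def write(self, sector, data):
        if self.faulty:
            # Simulate a failing disk: flip the low bit of the first byte.
            data = bytes([data[0] ^ 0x01]) + data[1:]
        self.blocks[sector] = data


class ReplicatedDevice:
    """Stands in for /dev/drbd0: one local disk plus one peer disk."""

    def __init__(self, local_disk, peer_disk):
        self.local_disk = local_disk
        self.peer_disk = peer_disk

    def write(self, sector, data):
        # Both copies are made from the buffer handed down by the upper
        # layer; the local disk is never read back before replicating.
        self.local_disk.write(sector, data)
        self.peer_disk.write(sector, data)


node1_disk = Disk(faulty=True)   # failing disk on node1
node2_disk = Disk()              # healthy disk on node2
drbd0 = ReplicatedDevice(node1_disk, node2_disk)

payload = b"good data"
drbd0.write(0, payload)

# node1's stored copy is damaged, node2's copy is what was submitted.
local_copy_damaged = node1_disk.blocks[0] != payload
peer_copy_intact = node2_disk.blocks[0] == payload
```

If the upper layer itself submits garbage as `payload`, the same model stores that garbage on both disks, which is the second case described above.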

> Example aside, in short, I am wondering if a failing disk on a node will
> result in DRBD replicating bad block data to the secondary node. I know
> there is a place in the docs describing an integrity checker using the
> kernel's crypto algorithms (like md5), so maybe that's an option to prevent it?

Nope, that will only prevent corruption that may occur *within* DRBD due
to a fishy network layer, or bit flips on your PCI bus, or broken
checksum offloading on your NICs.
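
As a rough illustration of what such an integrity check protects, here is a minimal Python sketch. It assumes md5 as the digest (the algorithm mentioned in the question); the function names `send_block` and `receive_block` are hypothetical, and the way DRBD actually frames the digest on the wire, or reacts to a mismatch, is not shown.

```python
import hashlib


def send_block(data):
    """Sender side: pair the block with a digest computed before sending."""
    return data, hashlib.md5(data).digest()


def receive_block(data, digest):
    """Receiver side: recompute the digest; False means damaged in transit."""
    return hashlib.md5(data).digest() == digest


block, digest = send_block(b"payload from the upper layer")

# Unmodified block: the digests match.
intact = receive_block(block, digest)

# One byte altered between sender and receiver: the digests differ.
damaged = receive_block(b"pAyload from the upper layer", digest)
```

The digest is computed over whatever the sender was asked to write, so a check like this says nothing about what a disk does with the block afterwards.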

For preventing corruption in the disk I/O layer, DRBD would have to
support DIF/DIX, which it currently doesn't do (very few applications do).

> In either case, is there any way to prevent bad block data from node 1
> being replicated to node 2?

Corruption rooted in the network stack, yes -- use data-integrity-alg.

Corruption rooted in your Xen domU, nope.

For corruption rooted in the I/O layer, you can't prevent the
replication from happening but you can detect the corruption after the
fact -- use verify-alg and run device verification.
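
The idea behind after-the-fact verification can be sketched as a block-by-block digest comparison between the two replicas. This is an illustrative Python model only: the tiny `BLOCK_SIZE`, the in-memory "devices", and the function names are assumptions, and md5 stands in for whatever algorithm is configured as verify-alg. In DRBD itself the run is started with `drbdadm verify` on a resource, as described in the User's Guide.

```python
import hashlib

BLOCK_SIZE = 4  # unrealistically small, for illustration only


def block_digests(device):
    """Return one digest per fixed-size block of the device contents."""
    return [
        hashlib.md5(device[i:i + BLOCK_SIZE]).digest()
        for i in range(0, len(device), BLOCK_SIZE)
    ]


def out_of_sync_blocks(local, peer):
    """Return the indexes of blocks whose digests differ between replicas."""
    pairs = zip(block_digests(local), block_digests(peer))
    return [index for index, (a, b) in enumerate(pairs) if a != b]


node1 = b"AAAABBBBCCCC"
node2 = b"AAAABxBBCCCC"  # second block differs on the peer

mismatches = out_of_sync_blocks(node1, node2)
```

A comparison like this only reports that the replicas differ; it cannot say which side holds the correct data.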

Hope this helps,
Florian

--
Need help with DRBD?
http://www.hastexo.com/now


charles at fixflyer

Oct 12, 2011, 11:00 AM

Post #4 of 9
Re: Disk Corruption = DRBD Failure?

This was exactly the answer I was looking for, thanks guys!

Also, do any white papers exist on how DRBD works on the inside? From
what you told me, it looks like it's

Write to DRBD Block Device -> Write to TCP buffer -> Write to host disks

I thought it was

Write DRBD Block Device -> Write to disk -> Write to TCP Buffer -> Write
to host disks (like a push method almost)

That is why I wanted to know about disk corruption, but it seems that I
should be more concerned about corruption in the network stack, right?




florian at hastexo

Oct 12, 2011, 11:09 AM

Post #5 of 9
Re: Disk Corruption = DRBD Failure?

On 2011-10-12 20:00, Charles Kozler wrote:
> This was 100% spot on the answer I was looking for, thanks guys!
>
> Also, do any white papers exist on how DRBD works on the inside? From
> what you told me it looks like it's

Um, "DRBD Internals" in the DRBD User's Guide? You can also check out
the Publications sections in that same guide; but those are most likely
of interest to kernel developers only. But don't let that scare you,
dive right in if you're so inclined. :)

> Write to DRBD Block Device -> Write to TCP buffer -> Write to host disks
>
> I thought it was
>
> Write DRBD Block Device -> Write to disk -> Write to TCP Buffer -> Write
> to host disks (like a push method almost)

So what's the difference between "Write to disk" and "Write
to host disks" in your model?

The actual pattern is also described in the User's Guide; see the
chapter named "DRBD Fundamentals".

> That is why I wanted to know about disk corruption, but it seems that I
> should be more concerned about corruption in the network stack, right?

No, you should be concerned about both, because your users will eat you
alive if you present them with broken data, no matter the source of
breakage.

Cheers,
Florian



charles at fixflyer

Oct 12, 2011, 11:30 AM

Post #6 of 9
Re: Disk Corruption = DRBD Failure?

I will re-read the DRBD Fundamentals. The way I understood it, if you
were writing to node1, DRBD wouldn't put the data through a TCP socket;
it would just write directly to the block device, and TCP was usually
only used for the actual replication and the data integrity
conversation between the hosts. My understanding now is that DRBD puts
the data into that socket for all hosts included in the resource
definition, including the host you're writing from (e.g. if I wrote to
/dev/drbd0 on host1, the data would still go through the socket to be
written to the underlying block device; I had originally thought it
would skip the TCP socket write and write directly to the block device).

I hope that was clear? It probably doesn't make sense because my
original understanding was wrong :)




florian at hastexo

Oct 14, 2011, 12:08 AM

Post #7 of 9
Re: Disk Corruption = DRBD Failure?

On 2011-10-12 20:30, Charles Kozler wrote:
> I will re-read the DRBD Fundamentals- the way I understood it was
> basically if you were writing to node1 it wouldn't put the data through
> a TCP socket and would actually just write directly to the block device
> and that TCP was usually only used for the actual replicating and data
> integrity conversation between the hosts. My understanding now is that
> for all hosts included in the resource definition it will put the data
> into that socket - including the host you're writing from (eg: if I
> wrote to /dev/drbd0 on host1 it will go through the socket to write the
> data still to write it to the underlying block device-

Er, no. It won't.

> I had originally
> thought it would skip the TCP socket write and write directly to the
> block device).

For the _local_ write, of course it doesn't go through the TCP socket.
Why should it? That would be braindead. Also, given the documentation,
what makes you think so? I ask because I wrote it, and if there's
anything horribly unclear in there I'd be happy to fix it.

Did you look at the illustration in the Fundamentals chapter?

Florian

--
Need help with High Availability?
http://www.hastexo.com/now


charles at fixflyer

Oct 14, 2011, 2:21 AM

Post #8 of 9
Re: Disk Corruption = DRBD Failure?

I haven't read it yet, though I will later today.

Not having read any of the documentation on the underlying processes and
workings, all of my understanding was based purely on assumptions from my
basic use of DRBD. That said, thank you for all your insight; I will let
you know later what I have understood :)




charles at fixflyer

Oct 14, 2011, 5:36 AM

Post #9 of 9
Re: Disk Corruption = DRBD Failure?

Hi Florian,

Thanks again for all of your help.

While the diagram makes the overall flow of the process clear, I am
looking for something like a flow chart that depicts the order of
operations. For instance, what is the flow of data and the order of
operations when a write occurs to /dev/drbd0 on the primary, and how is
it applied on the other node? Something like:

1. a write occurs to /dev/drbd0 on node1
2. it is written to the real block device on node1
3. it is then put on the socket to node2
4. node2 receives the data and applies the algo check (if configured)
5. the data is written to /dev/drbd0 on node2

The Fundamentals page also gives only a brief, high-level overview of how
it works. I am looking to see what actually occurs under the hood, so
perhaps I should start looking at the kernel docs that you pointed out
earlier?
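
One part of the ordering question that the User's Guide does spell out is when a write is reported as complete, which depends on the replication protocol (A, B, or C). The Python sketch below is only a summary of that description as recalled here, not DRBD code; the event names and the `write_is_complete` helper are hypothetical.

```python
# Events that must have happened before the primary reports a write as
# complete, per replication protocol. Event names are hypothetical labels.
REQUIRED_EVENTS = {
    # Protocol A: local write done, packet placed in the TCP send buffer.
    "A": {"local_disk_done", "in_tcp_send_buffer"},
    # Protocol B: local write done, packet has reached the peer node.
    "B": {"local_disk_done", "received_by_peer"},
    # Protocol C: local and remote disk writes both confirmed.
    "C": {"local_disk_done", "peer_disk_done"},
}


def write_is_complete(protocol, events_seen):
    """Return True once every event the protocol requires has been seen."""
    return REQUIRED_EVENTS[protocol] <= set(events_seen)


seen = ["local_disk_done", "in_tcp_send_buffer", "received_by_peer"]

complete_under_a = write_is_complete("A", seen)
complete_under_b = write_is_complete("B", seen)
complete_under_c = write_is_complete("C", seen)  # peer disk not confirmed yet
```

This models only the completion condition; it does not claim anything about the internal order in which the local write and the network send are issued.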


