
Mailing List Archive: DRBD: Users

STRANGE ISSUE - disk partition deleted after resync

 

 



roberto.fastec at gmail

Jul 22, 2010, 9:03 AM

Post #1 of 5 (642 views)
STRANGE ISSUE - disk partition deleted after resync

I'm doing my first tests with DRBD.
My setup is two PCs with two hard drives each.
One drive holds XenServer; the second is dedicated to DRBD. On the dedicated
drive I created /dev/sdb1, which drbd.conf assigns to drbd0.

Because this is a test environment, I messed things up a bit and had to
erase drbd0 (I'm still not sure what the correct procedure is, but I did
it) and, to be safe, I also deleted and recreated /dev/sdb1.

The idea was to start over by creating the drbd0 resource again.

The first problem was that *drbdadm create-md drbd0* exited with an error.
Googling, I found that the solution was to run (I'm not very experienced
with dd) *dd if=/dev/zero bs=1M count=1 of=/dev/sdb; sync*. It looks like
the first 1MB of data gets cleared (?); anyway, this worked.

Then *drbdadm create-md drbd0* worked again and I completed the sequence:
# drbdadm create-md drbd0 #Create device metadata
# drbdadm attach drbd0 #Attach to backing device
# drbdadm syncer drbd0 #Set synchronization parameters
# drbdadm connect drbd0 #Connect to peer

Finally I ran the primary command and started the sync:
# drbdadm -- --overwrite-data-of-peer primary drbd0

The sync worked fine, with a final average speed of 81MB/sec; the 200GB
volume was synced in 35 minutes. I also copied the output of cat /proc/drbd
into my notes while it was syncing:
[root@xenserver- dev]# cat /proc/drbd
version: 8.3.8.1 (api:88/proto:86-94)
GIT-hash: 0d8589fcc32c874df57c930ca1691399b55ec893 build by root@localhost, 2010-07-17 10:04:02
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----
ns:935040 nr:0 dw:0 dr:11435392 al:0 bm:697 lo:145 pe:0 ua:145 ap:0 ep:1 wo:b oos:183876844
[>...................] sync'ed: 5.9% (179564/190732)M
finish: 0:34:18 speed: 89,308 (76,232) K/sec

When it finished it was late at night, so I turned off the secondary, then
the primary, and went home.

Now I'm back to continue my tests. I turned on the primary, then the
secondary, and to my great surprise found this:

[root@xenserver- ~]# drbd-overview
0:drbd0 Unconfigured . . . .

So I also launched cat /proc/drbd:

[root@xenserver- ~]# cat /proc/drbd
version: 8.3.8.1 (api:88/proto:86-94)
GIT-hash: 0d8589fcc32c874df57c930ca1691399b55ec893 build by root@localhost, 2010-07-17 10:04:02
0: cs:Unconfigured

So I issued:

[root@xenserver- ~]# drbdadm up drbd0
Can not open device '/dev/sdb1': No such file or directory
Command 'drbdsetup 0 disk /dev/sdb1 /dev/sdb1 internal --set-defaults --create-device --max-bio-bvecs=1 --on-io-error=detach' terminated with exit code 20
drbdadm attach drbd0: exited with code 20

Because of the message, I checked with fdisk -l:

[root@xenserver- ~]# fdisk -l

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1         523     4194304   83  Linux
Partition 1 does not end on cylinder boundary.
/dev/sda2             523        1045     4194304   83  Linux
/dev/sda3            1045       60801   479995393   8e  Linux LVM

Disk /dev/sdb: 1000.2 GB, 1000215724032 bytes
255 heads, 63 sectors/track, 121602 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/sdb doesn't contain a valid partition table

Yes, /dev/sdb1 has disappeared... on both computers...

Below you will find the drbd.conf.

My questions are:
- What is the correct way to erase a DRBD resource?
- Is what I described above possible?
- What error or mistake could cause such a "disaster" (I'm thinking about a
production environment)?
- With the following configuration file, both DRBD nodes start in secondary
mode. Maybe this is intended by the DRBD design, but I don't understand it
very well.
I mean, when everything works fine, I have one primary and one secondary.
If I do a clean shutdown, shutting down the secondary first and then the
primary, why are both of them secondary after I start the primary first and
the secondary last? Is the *become-primary-on server-1* statement the only
solution? If so, I have another problem: it didn't work, and power-cycling
the two PCs in the correct order again produced a secondary/secondary
situation.
- Is it correct to use the server name in the statement
*become-primary-on server-1*?
- What are the downsides of using this setting in drbd.conf:
*become-primary-on server-1*?

Thank you for any tips and help; the conf file follows.
Robert


drbd.conf

# You can find an example in /usr/share/doc/drbd.../drbd.conf.example

#include "drbd.d/global_common.conf";
#include "drbd.d/*.res";

global {
    usage-count yes;
}

common {
    syncer {
        rate 1G;
        verify-alg md5;
        csums-alg md5;
    }
}

resource drbd0 {
    protocol C;

    startup {
        #become-primary-on xenserver-2;
    }

    net {
        cram-hmac-alg md5;
        shared-secret "ColdWater";
        sndbuf-size 0;
        rcvbuf-size 0;
        data-integrity-alg md5;
    }

    disk {
        max-bio-bvecs 1;
        on-io-error detach;
    }

    on server-1 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 10.1.1.2:7789;
        meta-disk internal;
    }

    on server-2 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 10.1.1.3:7789;
        meta-disk internal;
    }

    handlers {
        split-brain "/usr/lib/drbd/notify-split-brain.sh root";
    }

}


dbarker at visioncomm

Jul 22, 2010, 9:20 AM

Post #2 of 5 (604 views)
Re: STRANGE ISSUE - disk partition deleted after resync [In reply to]

I'm not sure why you wanted to "erase" /dev/sdb1, but to do so, you'd use dd
with an outfile of /dev/sdb1, not /dev/sdb. You cleared the first megabyte
of the disk, and that includes the partition table. Maybe you wanted to
create the metadata and it said metadata was already there. In that case, use
drbdadm -- --force create-md <res>. No dd is needed, and you can't specify
the wrong device: if the drbd configuration is correct, the proper device
will get the metadata.
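
To see why the dd command removed the partition, here is a minimal sketch that repeats it against a scratch image file instead of a real disk. The image file, the `read_sig` helper, and the hand-written boot signature are assumptions made for illustration; it also assumes GNU dd (for `bs=1M`) and a standard DOS/MBR layout, where the partition table sits in the first 512-byte sector and ends with the bytes 0x55 0xAA at offset 510.

```shell
# Scratch image standing in for a disk; nothing here touches a real device.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1M count=4 2>/dev/null

# Write the MBR boot signature (0x55 0xAA, given as octal escapes) at
# offset 510, the end of the sector that holds a DOS partition table.
printf '\125\252' | dd of="$img" bs=1 seek=510 conv=notrunc 2>/dev/null

# Hypothetical helper: print the two signature bytes as hex.
read_sig() {
    dd if="$img" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n'
}

sig_before=$(read_sig)

# The same dd as in the original post, aimed at the image, not /dev/sdb.
dd if=/dev/zero of="$img" bs=1M count=1 conv=notrunc 2>/dev/null

sig_after=$(read_sig)

echo "signature before: $sig_before, after: $sig_after"
rm -f "$img"
```

The first megabyte covers sector 0, so the signature reads back as zeros afterwards. Pointing the same command at /dev/sdb1 would have zeroed the start of the partition and left the table on /dev/sdb alone.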



If the disk does not contain data, there is no point in synchronizing it.
Just mark it empty and connect (drbdadm -- --clear-bitmap new-current-uuid
<resource>). It's instant vs 30 minutes. Details below.



Third, if the disk is dedicated to drbd, why have a partition at all? Just
use the whole disk (/dev/sdb vs /dev/sdb1). A partition just wastes space on
the partition table and confuses the issue.



Dan in Atlanta





dbarker at visioncomm

Jul 22, 2010, 9:23 AM

Post #3 of 5 (602 views)
Re: STRANGE ISSUE - disk partition deleted after resync [In reply to]

Oops! I forgot to show the entire procedure for new, blank disks (from
http://www.drbd.org/users-guide/re-drbdsetup.html)



New Blank Disk:
===============

#On both nodes, initialize meta data and configure the device.

drbdadm -- --force create-md <res>

#They need to do the initial handshake, so they know their sizes.

drbdadm up <res>

#They are now Connected Secondary/Secondary Inconsistent/Inconsistent.
#Generate a new current-uuid and clear the dirty bitmap.

drbdadm -- --clear-bitmap new-current-uuid <res>

#They are now Connected Secondary/Secondary UpToDate/UpToDate.

drbdadm primary <res>
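
For anyone who wants to review the sequence for a specific resource before typing it, the steps above could be wrapped in a small helper that only prints the commands. This is a sketch: `fresh_disk_plan` is a made-up name, not part of DRBD, and it deliberately does not execute anything, since `--force create-md` overwrites existing metadata.

```shell
# Hypothetical helper: print (do not run) the blank-disk sequence
# for one resource name, in the order given in the procedure above.
fresh_disk_plan() {
    res=${1:?usage: fresh_disk_plan <resource>}
    cat <<EOF
drbdadm -- --force create-md $res
drbdadm up $res
drbdadm -- --clear-bitmap new-current-uuid $res
drbdadm primary $res
EOF
}

# Print the plan for the resource used in this thread.
fresh_disk_plan drbd0
```

Per the comments in the procedure, the first two commands are meant for both nodes. The last two are assumed here to be run on one node only, once the nodes show as Connected.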



Dan in Atlanta







roberto.fastec at gmail

Jul 22, 2010, 9:42 AM

Post #4 of 5 (604 views)
Re: STRANGE ISSUE - disk partition deleted after resync [In reply to]

Dear Dan, thank you for your tip.

OK, thinking the sequence through, I deleted the partition table myself... ;-P

Better not to run tests too late at night...

But what was it that synced? Because it did sync. Was it my 200GB
/dev/sdb1 partition or some kind of ghost?

To answer your question, please note that /dev/sdb on the primary server
is a 1000GB disk, while /dev/sdb on the secondary server is a 1500GB disk.
Because of this size mismatch, I assumed the correct thing to do was to
create one partition of the same size on each drive.
If the sizes can differ, that is interesting, but for the moment, while
testing, I prefer to wait for a 200GB sync rather than a 1000GB one.

Thank you if you can advise me on these points as well.

Roberto


roberto.fastec at gmail

Jul 22, 2010, 11:50 PM

Post #5 of 5 (597 views)
Re: STRANGE ISSUE - disk partition deleted after resync [In reply to]

[Solved]

Thanks to Dan I realized that I had deleted my partition table myself. It is still unknown which 200GB were synced.

Additionally, Dan gave me some tips that I'm now going to share with other newbies like me :-)

- If you dedicate a whole disk to DRBD on each server:
1) Don't worry about their sizes; DRBD will use the smaller of the two (though looking at dmesg, I saw it complain about the size difference and do some sync/check).
2) There is no need to partition them. Not partitioning may even be better, since DRBD then doesn't have to deal with a partition table.

- If you messed things up and want to start over with such a configuration, and for some reason delete the DRBD resource:
1) If the disks are still empty (mine are, because I'm only testing DRBD itself), you can skip waiting for a full sync and do as follows:

New Blank Disk:
===============

#On both nodes, initialize meta data and configure the device.

drbdadm -- --force create-md <res>

#They need to do the initial handshake, so they know their sizes.

drbdadm up <res>

#They are now Connected Secondary/Secondary Inconsistent/Inconsistent.
#Generate a new current-uuid and clear the dirty bitmap.

drbdadm -- --clear-bitmap new-current-uuid <res>

#They are now Connected Secondary/Secondary UpToDate/UpToDate.

drbdadm primary <res>

# finished

- If you have a simple primary/secondary DRBD setup:
1) The drbd.conf file must contain the become-primary-on <mainservername> statement.
2) Boot sequence: <mainservername> will become primary only if its OS boots after the secondary node's OS has booted. Otherwise both nodes will be secondary, no matter the above statement.
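
One way to check which role each node actually took after a reboot is to read the ro: field of /proc/drbd. The sketch below assumes the DRBD 8.3 output format pasted earlier in this thread; `drbd_roles` is a hypothetical helper name, and the sample line is copied from that paste.

```shell
# Hypothetical helper: read /proc/drbd-style text on stdin and print
# "<local role> <peer role>" for each line that has an ro: field.
drbd_roles() {
    sed -n 's|.*ro:\([A-Za-z]*\)/\([A-Za-z]*\).*|\1 \2|p'
}

# Sample status line taken from the /proc/drbd paste in this thread.
sample='0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----'
roles=$(printf '%s\n' "$sample" | drbd_roles)
echo "$roles"
```

On a real node the same function could be fed with `drbd_roles < /proc/drbd`. A result of "Secondary Secondary" on both machines would match the situation described above.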

Kind regards

R.
_______________________________________________
drbd-user mailing list
drbd-user [at] lists
http://lists.linbit.com/mailman/listinfo/drbd-user
