
Mailing List Archive: DRBD: Users

IO error when mounting device

 

 



lawrence at junkmail

Feb 17, 2012, 1:15 AM

Post #1 of 2 (271 views)
IO error when mounting device

Hi List,

I use DRBD in dual-primary mode with ocfs2 for my load-balanced web
server cluster. I didn't encounter any errors during setup, and when I
put the web site on the DRBD device on the primary node, it replicated
without any errors. It ran fine during the week of testing, but this
morning, when we updated code located on the DRBD device, we noticed
it was not replicating to the secondary node.
The DRBD device was mounted on both nodes, but /proc/drbd showed this:

version: 8.3.7 (api:88/proto:86-91)
GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by root [at] web01, 2012-01-10 09:54:40
 0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r----
    ns:0 nr:0 dw:5960937 dr:5047235 al:1490 bm:1363 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:8840028


I restarted drbd and ocfs2 but still the result was the same. Next I
rebooted the misbehaving node and noticed when it came back up that the
DRBD device was no longer mounted.

Trying to mount the device manually returns this error:
mount /dev/drbd0
mount.ocfs2: I/O error on channel while opening device /dev/drbd0


A tail of the log file shows nothing new, but an earlier entry shows this:

Feb 17 10:47:54 web02 kernel: [ 13.531600] block drbd0: disk( Attaching -> UpToDate )
Feb 17 10:47:54 web02 kernel: [ 13.535865] block drbd0: conn( StandAlone -> Unconnected )
Feb 17 10:47:54 web02 kernel: [ 13.535889] block drbd0: Starting receiver thread (from drbd0_worker [1484])
Feb 17 10:47:54 web02 kernel: [ 13.535998] block drbd0: receiver (re)started
Feb 17 10:47:54 web02 kernel: [ 13.536006] block drbd0: conn( Unconnected -> WFConnection )


This is my r1.res file:

===============================================================
resource r1 {
    meta-disk internal;
    device /dev/drbd0;
    disk /dev/vol01/docroot;

    syncer { rate 1000M; }
    net {
        allow-two-primaries;
        after-sb-0pri discard-zero-changes;
        after-sb-1pri discard-secondary;
        after-sb-2pri disconnect;
    }
    startup { become-primary-on both; }

    on web01.junkmail.co.za { address 10.0.0.111:7789; }
    on web02.junkmail.co.za { address 10.0.0.112:7789; }
}
===============================================================



Here is /etc/ocfs2/cluster.conf:

===============================================================
cluster:
	node_count = 2
	name = jbm_web

node:
	ip_port = 7777
	ip_address = 10.0.0.111
	number = 1
	name = web01
	cluster = jbm_web

node:
	ip_port = 7777
	ip_address = 10.0.0.112
	number = 2
	name = web02
	cluster = jbm_web
================================================================



Any help/ideas much appreciated - the pressure is on here.

Thanks

-- 
Lawrence Strydom
Linux System Administrator
Junk Mail Publishing Group
Tel : (+27) 12 342 3840 Ext 2811
Fax : 0000
Email : lawrence [at] junkmail





lars.ellenberg at linbit

Feb 20, 2012, 12:04 PM

Post #2 of 2 (250 views)
Re: IO error when mounting device [In reply to]

On Fri, Feb 17, 2012 at 11:15:45AM +0200, Lawrence Strydom wrote:
> Hi List,
>
> I used DRBD in dual primary mode with ocfs2 for my load balancing

If you use dual primary DRBD with cluster file systems,
you *MUST* have *working* and *tested* fencing in place.

That is, the DRBD fencing policy has to be "resource-and-stonith",
and the "fence-peer" handler is supposed to trigger, or at least wait
for, node-level fencing as well.
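
As a rough sketch of what that could look like in the r1 resource from
the original post: this assumes DRBD 8.3 syntax, where the fencing
policy is set in the disk section (check the drbd.conf man page for the
installed version). The handler path is a hypothetical placeholder, not
a script shipped with DRBD; it has to be replaced with a handler that
really fences the peer node in the cluster stack in use, and tested.

```
resource r1 {
  disk {
    # Freeze I/O and call the fence-peer handler when the peer is lost.
    fencing resource-and-stonith;
  }
  handlers {
    # Hypothetical path: substitute a tested node-fencing handler.
    fence-peer "/usr/local/sbin/my-fence-peer.sh";
  }
  # ... net, startup, syncer and "on" sections unchanged ...
}
```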

This is necessary because otherwise, as soon as the replication
connection breaks, the data on both nodes could diverge
(aka split brain, or even just resource-internal split brain).
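
A broken replication link is easy to miss while both nodes keep serving
the mounted file system. The following is a minimal monitoring sketch,
not part of DRBD: the function names are made up for illustration, and
it only assumes the /proc/drbd line format shown in the original post
("0: cs:StandAlone ro:..."). It lists every device whose connection
state is StandAlone or WFConnection.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with DRBD) for spotting a broken
# replication link in /proc/drbd-style text read from stdin.

# Print "<minor> <connection state>" for every device line.
drbd_conn_states() {
    sed -n 's/^ *\([0-9][0-9]*\): cs:\([A-Za-z]*\).*/\1 \2/p'
}

# Print only the devices that are not talking to their peer.
drbd_disconnected_minors() {
    drbd_conn_states | while read -r minor state; do
        case "$state" in
            StandAlone|WFConnection) echo "$minor $state" ;;
        esac
    done
}
```

It could be run from cron or a monitoring agent as
`drbd_disconnected_minors < /proc/drbd`, alerting whenever the output
is not empty. This only detects the problem; it is no substitute for
fencing.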

I also recommend upgrading to 8.3.12.
With 8.3.7, you'd probably still have to configure after-split-brain
auto recovery policies, even if you got the fencing right.

> web server cluster. [...]


--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com

DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
__
please don't Cc me, but send to list -- I'm subscribed