
Mailing List Archive: DRBD: Users

servers out of sync

 

 



marcel at kraan

May 13, 2012, 6:25 AM


I can't get the servers back in sync.
Both of them are now StandAlone?
I can ping them both.

I don't have any options left.

[root [at] kvmstorage drbd.d]# cat /proc/drbd
version: 8.3.12 (api:88/proto:86-96)
GIT-hash: e2a8ef4656be026bbae540305fcb998a5991090f build by phil [at] Build64R, 2012-04-08 09:36:52
0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r-----
ns:0 nr:0 dw:412 dr:9926 al:2 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:280

[root [at] kvmstorage drbd.d]# cat /proc/drbd
version: 8.3.12 (api:88/proto:86-96)
GIT-hash: e2a8ef4656be026bbae540305fcb998a5991090f build by phil [at] Build64R, 2012-04-08 09:36:52
0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:264
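
The two status lines above can also be compared programmatically. What follows is only a minimal sketch: the helper names `parse_status` and `both_standalone` are made up for illustration, and the regular expression assumes the 8.3-style `/proc/drbd` layout shown in this post. It makes the key point visible: both nodes report `cs:StandAlone`, so neither is trying to reach its peer, even though the hosts can ping each other.

```python
import re

# Status lines copied from the two /proc/drbd outputs above.
PRIMARY_LINE = "0: cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r-----"
SECONDARY_LINE = "0: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r-----"

# Assumes the 8.3-style layout: "<minor>: cs:<state> ro:<local>/<peer> ds:<local>/<peer>".
STATUS_RE = re.compile(
    r"^\s*(?P<minor>\d+):\s+cs:(?P<cs>\S+)\s+ro:(?P<ro>\S+)\s+ds:(?P<ds>\S+)"
)


def parse_status(line):
    """Split one status line into connection state, roles, and disk states."""
    m = STATUS_RE.match(line)
    if m is None:
        raise ValueError(f"not a DRBD status line: {line!r}")
    local_role, peer_role = m.group("ro").split("/")
    local_disk, peer_disk = m.group("ds").split("/")
    return {
        "minor": int(m.group("minor")),
        "cs": m.group("cs"),
        "local_role": local_role,
        "peer_role": peer_role,
        "local_disk": local_disk,
        "peer_disk": peer_disk,
    }


def both_standalone(line_a, line_b):
    """True when both nodes report the StandAlone connection state."""
    return parse_status(line_a)["cs"] == parse_status(line_b)["cs"] == "StandAlone"


if __name__ == "__main__":
    for line in (PRIMARY_LINE, SECONDARY_LINE):
        print(parse_status(line))
    print("both StandAlone:", both_standalone(PRIMARY_LINE, SECONDARY_LINE))
```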




/var/log/messages on the two servers:

[root [at] kvmstorage drbd.d]# service drbd restart
Stopping all DRBD resources: May 13 15:14:13 kvmstorage2 kernel: block drbd0: disk( UpToDate -> Failed )
May 13 15:14:13 kvmstorage2 kernel: block drbd0: disk( Failed -> Diskless )
May 13 15:14:13 kvmstorage2 kernel: block drbd0: drbd_bm_resize called with capacity == 0
May 13 15:14:13 kvmstorage2 kernel: block drbd0: worker terminated
May 13 15:14:13 kvmstorage2 kernel: block drbd0: Terminating worker thread
May 13 15:14:13 kvmstorage2 kernel: drbd: module cleanup done.
.
Starting DRBD resources: May 13 15:14:13 kvmstorage2 kernel: drbd: initialized. Version: 8.3.12 (api:88/proto:86-96)
May 13 15:14:13 kvmstorage2 kernel: drbd: GIT-hash: e2a8ef4656be026bbae540305fcb998a5991090f build by phil [at] Build64R, 2012-04-08 09:36:52
May 13 15:14:13 kvmstorage2 kernel: drbd: registered as block device major 147
May 13 15:14:13 kvmstorage2 kernel: drbd: minor_table @ 0xffff88020f7257c0
[. d(main) May 13 15:14:13 kvmstorage2 kernel: block drbd0: Starting worker thread (from cqueue [1344])
May 13 15:14:13 kvmstorage2 kernel: block drbd0: disk( Diskless -> Attaching )
May 13 15:14:13 kvmstorage2 kernel: block drbd0: Found 6 transactions (34 active extents) in activity log.
May 13 15:14:13 kvmstorage2 kernel: block drbd0: Method to ensure write ordering: barrier
May 13 15:14:13 kvmstorage2 kernel: block drbd0: max BIO size = 131072
May 13 15:14:13 kvmstorage2 kernel: block drbd0: drbd_bm_resize called with capacity == 6920386232
May 13 15:14:13 kvmstorage2 kernel: block drbd0: resync bitmap: bits=865048279 words=13516380 pages=26400
May 13 15:14:13 kvmstorage2 kernel: block drbd0: size = 3300 GB (3460193116 KB)
May 13 15:14:13 kvmstorage2 kernel: block drbd0: bitmap READ of 26400 pages took 198 jiffies
May 13 15:14:13 kvmstorage2 kernel: block drbd0: recounting of set bits took additional 90 jiffies
May 13 15:14:13 kvmstorage2 kernel: block drbd0: 264 KB (66 bits) marked out-of-sync by on disk bit-map.
May 13 15:14:13 kvmstorage2 kernel: block drbd0: disk( Attaching -> UpToDate )
May 13 15:14:13 kvmstorage2 kernel: block drbd0: attached to UUIDs C12A485E56F51104:9555562D91EACAC2:A615ADBD6A39BD99:A614ADBD6A39BD99
n(main) May 13 15:14:13 kvmstorage2 kernel: block drbd0: conn( StandAlone -> Unconnected )
May 13 15:14:13 kvmstorage2 kernel: block drbd0: Starting receiver thread (from drbd0_worker [6484])
May 13 15:14:13 kvmstorage2 kernel: block drbd0: receiver (re)started
May 13 15:14:13 kvmstorage2 kernel: block drbd0: conn( Unconnected -> WFConnection )
]May 13 15:14:14 kvmstorage2 kernel: block drbd0: Handshake successful: Agreed network protocol version 96
May 13 15:14:14 kvmstorage2 kernel: block drbd0: conn( WFConnection -> WFReportParams )
May 13 15:14:14 kvmstorage2 kernel: block drbd0: Starting asender thread (from drbd0_receiver [6494])
May 13 15:14:14 kvmstorage2 kernel: block drbd0: data-integrity-alg: <not-used>
May 13 15:14:14 kvmstorage2 kernel: block drbd0: drbd_sync_handshake:
May 13 15:14:14 kvmstorage2 kernel: block drbd0: self C12A485E56F51104:9555562D91EACAC2:A615ADBD6A39BD99:A614ADBD6A39BD99 bits:66 flags:0
May 13 15:14:14 kvmstorage2 kernel: block drbd0: peer E33CEADD1FF28EE1:9555562D91EACAC3:A615ADBD6A39BD98:A614ADBD6A39BD99 bits:70 flags:0
May 13 15:14:14 kvmstorage2 kernel: block drbd0: uuid_compare()=100 by rule 90
May 13 15:14:14 kvmstorage2 kernel: block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
May 13 15:14:14 kvmstorage2 kernel: block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
May 13 15:14:14 kvmstorage2 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
May 13 15:14:14 kvmstorage2 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0
May 13 15:14:14 kvmstorage2 kernel: block drbd0: meta connection shut down by peer.
May 13 15:14:14 kvmstorage2 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
May 13 15:14:14 kvmstorage2 kernel: block drbd0: conn( WFReportParams -> Disconnecting )
May 13 15:14:14 kvmstorage2 kernel: block drbd0: error receiving ReportState, l: 4!
May 13 15:14:14 kvmstorage2 kernel: block drbd0: asender terminated
May 13 15:14:14 kvmstorage2 kernel: block drbd0: Terminating asender thread
May 13 15:14:14 kvmstorage2 kernel: block drbd0: Connection closed
May 13 15:14:14 kvmstorage2 kernel: block drbd0: conn( Disconnecting -> StandAlone )
May 13 15:14:14 kvmstorage2 kernel: block drbd0: receiver terminated
May 13 15:14:14 kvmstorage2 kernel: block drbd0: Terminating receiver thread



second server (primary right now)

[root [at] kvmstorage drbd.d]# service drbd restart
Stopping all DRBD resources: umount: /datastore: device is busy.
(In some cases useful info about processes that use
the device is found by lsof(8) or fuser(1))
/dev/drbd0: State change failed: (-12) Device is held open by someone
May 13 15:16:22 kvmstorage1 kernel: block drbd0: State change failed: Device is held open by someone
May 13 15:16:22 kvmstorage1 kernel: block drbd0: state = { cs:StandAlone ro:Primary/Unknown ds:UpToDate/DUnknown r----- }
May 13 15:16:22 kvmstorage1 kernel: block drbd0: wanted = { cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown r----- }
ERROR: Module drbd is in use
.
Starting DRBD resources: [. n(main) May 13 15:16:22 kvmstorage1 kernel: block drbd0: conn( StandAlone -> Unconnected )
May 13 15:16:22 kvmstorage1 kernel: block drbd0: Starting receiver thread (from drbd0_worker [1441])
May 13 15:16:22 kvmstorage1 kernel: block drbd0: receiver (re)started
May 13 15:16:22 kvmstorage1 kernel: block drbd0: conn( Unconnected -> WFConnection )
]..........
***************************************************************
DRBD's startup script waits for the peer node(s) to appear.
- In case this node was already a degraded cluster before the
reboot the timeout is 0 seconds. [degr-wfc-timeout]
- If the peer was available before the reboot the timeout will
expire after 0 seconds. [wfc-timeout]
(These values are for resource 'drbd'; 0 sec -> wait forever)
(i had to restart drbd on the second node)
To abort waiting enter 'yes' [ 54]:May 13 15:17:16 kvmstorage1 kernel: block drbd0: Handshake successful: Agreed network protocol version 96
May 13 15:17:16 kvmstorage1 kernel: block drbd0: conn( WFConnection -> WFReportParams )
May 13 15:17:16 kvmstorage1 kernel: block drbd0: Starting asender thread (from drbd0_receiver [7458])
May 13 15:17:16 kvmstorage1 kernel: block drbd0: data-integrity-alg: <not-used>
May 13 15:17:16 kvmstorage1 kernel: block drbd0: drbd_sync_handshake:
May 13 15:17:16 kvmstorage1 kernel: block drbd0: self E33CEADD1FF28EE1:9555562D91EACAC3:A615ADBD6A39BD98:A614ADBD6A39BD99 bits:70 flags:0
May 13 15:17:16 kvmstorage1 kernel: block drbd0: peer C12A485E56F51104:9555562D91EACAC2:A615ADBD6A39BD99:A614ADBD6A39BD99 bits:66 flags:0
May 13 15:17:16 kvmstorage1 kernel: block drbd0: uuid_compare()=100 by rule 90
May 13 15:17:16 kvmstorage1 kernel: block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0
May 13 15:17:16 kvmstorage1 kernel: block drbd0: helper command: /sbin/drbdadm initial-split-brain minor-0 exit code 0 (0x0)
May 13 15:17:16 kvmstorage1 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
May 13 15:17:16 kvmstorage1 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0

May 13 15:17:16 kvmstorage1 kernel: block drbd0: helper command: /sbin/drbdadm split-brain minor-0 exit code 0 (0x0)
May 13 15:17:16 kvmstorage1 kernel: block drbd0: conn( WFReportParams -> Disconnecting )
May 13 15:17:16 kvmstorage1 kernel: block drbd0: error receiving ReportState, l: 4!
May 13 15:17:16 kvmstorage1 kernel: block drbd0: asender terminated
May 13 15:17:16 kvmstorage1 kernel: block drbd0: Terminating asender thread
May 13 15:17:16 kvmstorage1 kernel: block drbd0: Connection closed
May 13 15:17:16 kvmstorage1 kernel: block drbd0: conn( Disconnecting -> StandAlone )
May 13 15:17:16 kvmstorage1 kernel: block drbd0: receiver terminated
May 13 15:17:16 kvmstorage1 kernel: block drbd0: Terminating receiver thread
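
The decisive lines are easy to miss in that much output. As a rough sketch (the `summarize` helper and its regular expressions are assumptions written for this log format, not part of DRBD), the following pulls out the split-brain message and the `self`/`peer` handshake lines. The first UUID in each set is the node's current data generation, and `bits` is the number of 4 KB blocks marked out of sync.

```python
import re

# Three lines taken verbatim from the kvmstorage1 log above.
LOG = """\
May 13 15:17:16 kvmstorage1 kernel: block drbd0: self E33CEADD1FF28EE1:9555562D91EACAC3:A615ADBD6A39BD98:A614ADBD6A39BD99 bits:70 flags:0
May 13 15:17:16 kvmstorage1 kernel: block drbd0: peer C12A485E56F51104:9555562D91EACAC2:A615ADBD6A39BD99:A614ADBD6A39BD99 bits:66 flags:0
May 13 15:17:16 kvmstorage1 kernel: block drbd0: Split-Brain detected but unresolved, dropping connection!
"""

UUID_RE = re.compile(
    r"block (?P<dev>drbd\d+): (?P<side>self|peer) (?P<uuids>[0-9A-F:]+) bits:(?P<bits>\d+)"
)
SPLIT_RE = re.compile(r"(?P<host>\S+) kernel: block (?P<dev>drbd\d+): Split-Brain detected")


def summarize(log_text):
    """Collect split-brain hits and the last self/peer UUID sets seen."""
    summary = {"split_brain": [], "self": None, "peer": None}
    for line in log_text.splitlines():
        m = SPLIT_RE.search(line)
        if m:
            summary["split_brain"].append((m.group("host"), m.group("dev")))
            continue
        m = UUID_RE.search(line)
        if m:
            summary[m.group("side")] = {
                # The first field of the UUID set is the current UUID.
                "current": m.group("uuids").split(":")[0],
                "bits": int(m.group("bits")),
            }
    return summary


if __name__ == "__main__":
    info = summarize(LOG)
    print("split brain reported on:", info["split_brain"])
    print("current UUIDs differ:", info["self"]["current"] != info["peer"]["current"])
    print("out of sync (KB): self", info["self"]["bits"] * 4, "peer", info["peer"]["bits"] * 4)
```

The differing current UUIDs show that each node wrote data on its own after the connection was lost, and the bit counts line up with the `oos:280` and `oos:264` values in the `/proc/drbd` output.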


florian at hastexo

May 13, 2012, 6:50 AM

Re: servers out of sync

On Sun, May 13, 2012 at 3:25 PM, Marcel Kraan <marcel [at] kraan> wrote:
> i don't get it synced again.
> they are now both stand alone?
> i can ping them both.
>
> don't have any options left.

Yes you do.

http://www.drbd.org/users-guide-8.3/s-resolve-split-brain.html

Googling this log message would have led you there:

> May 13 15:17:16 kvmstorage1 kernel: block drbd0: Split-Brain detected but
> unresolved, dropping connection!
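
For orientation, the manual recovery in that guide section comes down to a few `drbdadm` calls in DRBD 8.3 syntax: demote the node whose changes will be thrown away, reconnect it with `--discard-my-data`, and reconnect the other node if it is also StandAlone. The sketch below only builds the command list and executes nothing. The function name is hypothetical, the resource name `drbd` is taken from the startup banner in the first post, and picking kvmstorage2 as the node to discard is purely an illustrative assumption. Which node loses its changes is a decision for the administrator, and everything written on that node since the split is lost.

```python
def split_brain_recovery_plan(resource, victim, survivor):
    """Return the commands to run on each node; nothing is executed here.

    victim   -- node whose changes since the split will be discarded
    survivor -- node whose data is kept
    """
    if victim == survivor:
        raise ValueError("victim and survivor must be different nodes")
    return {
        victim: [
            # The victim must be Secondary before it can discard its data.
            f"drbdadm secondary {resource}",
            # DRBD 8.3 syntax; the "--" passes the option through to drbdsetup.
            f"drbdadm -- --discard-my-data connect {resource}",
        ],
        survivor: [
            # Only needed if the survivor is StandAlone as well, as it is here.
            f"drbdadm connect {resource}",
        ],
    }


if __name__ == "__main__":
    # Assumption for illustration only: keep kvmstorage1, discard kvmstorage2.
    plan = split_brain_recovery_plan("drbd", victim="kvmstorage2", survivor="kvmstorage1")
    for node, commands in plan.items():
        print(f"on {node}:")
        for command in commands:
            print(f"  {command}")
```

After the reconnect, the discarded node should resync from the surviving one; `cat /proc/drbd` on either node shows the progress.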

Cheers,
Florian

--
Need help with High Availability?
http://www.hastexo.com/now
_______________________________________________
drbd-user mailing list
drbd-user [at] lists
http://lists.linbit.com/mailman/listinfo/drbd-user
