Mailing List Archive: DRBD: Users

What does "block drbd0: BAD! sector= .... cstate=SyncSource" want to tell me?

 

 


lvml at 5t9

Jul 4, 2012, 11:14 AM

Hi,

I just had to reboot a system that is configured as the "secondary" for 3 DRBD devices.
After the reboot, connection to the primary system was established and re-synchronisation started.

Some scary messages were emitted during that process - on the primary:

> block drbd0: uuid_compare()=1 by rule 70
> block drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapS ) pdsk( DUnknown -> Consistent )
> block drbd0: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 68591(17), total 68591; compression: 99.9%
> block drbd0: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 68591(17), total 68591; compression: 99.9%
> block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0
> block drbd0: helper command: /sbin/drbdadm before-resync-source minor-0 exit code 0 (0x0)
> block drbd0: conn( WFBitMapS -> SyncSource ) pdsk( Consistent -> Inconsistent )
> block drbd0: updated sync UUID 0AE...
> block drbd0: Began resync as SyncSource (will sync 3242376 KB [810594 bits set]).
> block drbd0: BAD! sector=2513600376s enr=76708 rs_left=-15 rs_failed=0 count=32 cstate=SyncSource
> block drbd0: BAD! sector=2342485912s enr=71486 rs_left=-19 rs_failed=0 count=32 cstate=SyncSource
> block drbd0: BAD! sector=3393683440s enr=103566 rs_left=-30 rs_failed=0 count=32 cstate=SyncSource
> block drbd0: BAD! sector=3062103824s enr=93447 rs_left=-2 rs_failed=0 count=32 cstate=SyncSource
> block drbd0: Resync done (total 337 sec; paused 0 sec; 9620 K/sec)
> block drbd0: updated UUIDs 0AE...
> block drbd0: conn( SyncSource -> Connected ) pdsk( Inconsistent -> UpToDate )
> block drbd0: bitmap WRITE of 0 pages took 1 jiffies
> block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.

And on the (rebooted) secondary:

> block drbd0: disk( Diskless -> Attaching )
> block drbd0: max BIO size = 131072
> block drbd0: drbd_bm_resize called with capacity == 3550894184
> block drbd0: resync bitmap: bits=443861773 words=6935341 pages=13546
> block drbd0: size = 1693 GB (1775447092 KB)
> block drbd0: bitmap READ of 13546 pages took 1443 jiffies
> block drbd0: recounting of set bits took additional 34 jiffies
> block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
> block drbd0: disk( Attaching -> UpToDate )
> block drbd0: attached to UUIDs 9DE...
> block drbd0: drbd_sync_handshake:
> block drbd0: self 9DE... bits:0 flags:0
> block drbd0: peer 0AE... bits:810473 flags:0
> block drbd0: uuid_compare()=-1 by rule 50
> block drbd0: peer( Unknown -> Primary ) conn( WFReportParams -> WFBitMapT ) disk( UpToDate -> Outdated ) pdsk( DUnknown -> UpToDate )
> block drbd0: receive bitmap stats [Bytes(packets)]: plain 0(0), RLE 68591(17), total 68591; compression: 99.9%
> block drbd0: send bitmap stats [Bytes(packets)]: plain 0(0), RLE 68591(17), total 68591; compression: 99.9%
> block drbd0: conn( WFBitMapT -> WFSyncUUID )
> block drbd0: updated sync uuid 9DE...
> block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0
> block drbd0: helper command: /sbin/drbdadm before-resync-target minor-0 exit code 0 (0x0)
> block drbd0: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent )
> block drbd0: Began resync as SyncTarget (will sync 3242376 KB [810594 bits set]).
> block drbd0: BAD! sector=2513600376s enr=76708 rs_left=-15 rs_failed=0 count=32 cstate=SyncTarget
> block drbd0: BAD! sector=2342485912s enr=71486 rs_left=-19 rs_failed=0 count=32 cstate=SyncTarget
> block drbd0: BAD! sector=3393683440s enr=103566 rs_left=-30 rs_failed=0 count=32 cstate=SyncTarget
> block drbd0: BAD! sector=3062103824s enr=93447 rs_left=-2 rs_failed=0 count=32 cstate=SyncTarget
> block drbd0: Resync done (total 337 sec; paused 0 sec; 9620 K/sec)
> block drbd0: updated UUIDs 0AE...
> block drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate )
> block drbd0: helper command: /sbin/drbdadm after-resync-target minor-0
> block drbd0: helper command: /sbin/drbdadm after-resync-target minor-0 exit code 0 (0x0)
> block drbd0: bitmap WRITE of 0 pages took 1 jiffies
> block drbd0: 0 KB (0 bits) marked out-of-sync by on disk bit-map.
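The size figures in these logs can be cross-checked against each other. The sketch below is only arithmetic on the logged numbers; the constants (512-byte sectors, 4 KB per bitmap bit, 64-bit words, 4 KB pages) are assumptions chosen because they reproduce the logged values, not something stated in the log itself.

```python
# All constants are assumptions inferred from the logged numbers.
SECTOR_BYTES = 512    # capacity is taken to be in 512-byte sectors
KB_PER_BIT = 4        # one bitmap bit is taken to cover 4 KB
BITS_PER_WORD = 64    # bitmap words are taken to be 64 bits wide
PAGE_BYTES = 4096     # bitmap pages are taken to be 4 KB


def ceil_div(a, b):
    """Integer division rounding up."""
    return -(-a // b)


capacity_sectors = 3550894184                      # "drbd_bm_resize called with capacity"
size_kb = capacity_sectors * SECTOR_BYTES // 1024  # logged: 1775447092 KB
size_gb = size_kb >> 20                            # logged: 1693 GB
bits = ceil_div(size_kb, KB_PER_BIT)               # logged: bits=443861773
words = ceil_div(bits, BITS_PER_WORD)              # logged: words=6935341
pages = ceil_div(words * 8, PAGE_BYTES)            # logged: pages=13546

resync_bits = 810594                               # "[810594 bits set]"
resync_kb = resync_bits * KB_PER_BIT               # logged: will sync 3242376 KB

# Dividing the bit count by the duration first and scaling afterwards
# matches the logged 9620 K/sec; the order of operations is a guess.
rate_kb_per_sec = (resync_bits // 337) * KB_PER_BIT

print(size_kb, size_gb, bits, words, pages, resync_kb, rate_kb_per_sec)
```

Under these assumptions every derived value agrees with the log, which suggests the bitmap bookkeeping itself was consistent during this resync.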

Now I wonder: What does drbd0 want to tell me with those "BAD! ..." messages?

It seems to have completed the synchronization successfully. Also, no "read errors" were
reported on either host.

Should I be concerned about the data integrity, now?
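For anyone who wants to check their own kernel log for the same messages, here is a minimal sketch that extracts the fields from "BAD!" lines. The regular expression is modelled only on the lines quoted in this post, so it is an assumption that other DRBD versions format the message the same way; `parse_bad_lines` is a hypothetical helper name.

```python
import re

# Pattern modelled on the "BAD!" lines quoted in this post (assumption:
# other DRBD versions may format the message differently).
BAD_LINE = re.compile(
    r"block (?P<dev>drbd\d+): BAD! "
    r"sector=(?P<sector>\d+)s enr=(?P<enr>\d+) "
    r"rs_left=(?P<rs_left>-?\d+) rs_failed=(?P<rs_failed>\d+) "
    r"count=(?P<count>\d+) cstate=(?P<cstate>\w+)"
)


def parse_bad_lines(log_text):
    """Return one dict per BAD! line found in log_text."""
    records = []
    for match in BAD_LINE.finditer(log_text):
        fields = match.groupdict()
        # Keep the device and state names as strings, convert the rest to int.
        records.append({
            key: value if key in ("dev", "cstate") else int(value)
            for key, value in fields.items()
        })
    return records


sample = (
    "block drbd0: BAD! sector=2513600376s enr=76708 rs_left=-15 rs_failed=0 count=32 cstate=SyncSource\n"
    "block drbd0: Resync done (total 337 sec; paused 0 sec; 9620 K/sec)\n"
    "block drbd0: BAD! sector=3062103824s enr=93447 rs_left=-2 rs_failed=0 count=32 cstate=SyncSource\n"
)

records = parse_bad_lines(sample)
for rec in records:
    print(rec["dev"], rec["enr"], rec["rs_left"], rec["rs_failed"])
```

In the lines from this post, `rs_failed` is 0 everywhere and only `rs_left` is negative.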

Regards,

Lutz Vieweg

_______________________________________________
drbd-user mailing list
drbd-user [at] lists
http://lists.linbit.com/mailman/listinfo/drbd-user


lars.ellenberg at linbit

Jul 5, 2012, 10:00 AM

On Wed, Jul 04, 2012 at 08:14:05PM +0200, Lutz Vieweg wrote:
> Hi,
>
> I just had to reboot a system that is configured as the "secondary" for 3 DRBD devices.
> After the reboot, connection to the primary system was established and re-synchronisation started.
>
> Some scary messages were emitted during that process - on the primary:
>
> >[...]
>
> And on the (rebooted) secondary:
>
> >[...]
>
> Now I wonder: What does drbd0 want to tell me with those "BAD! ..." messages?

It's just some reference counter that should not have gone negative,
but did, because we forgot to update/reinitialize it at some stage.
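As a side note on reading those lines: in all four quoted messages, `enr` equals the sector number divided by 32768, which would correspond to 16 MiB extents of 512-byte sectors. That extent size is an inference from the logged numbers, not something confirmed in this thread. A short sketch of the check:

```python
# Assumption inferred from the logged values: one extent spans 32768
# sectors of 512 bytes each, i.e. 16 MiB.
SECTORS_PER_EXTENT = 32768
SECTOR_BYTES = 512

# (sector, enr) pairs copied from the four quoted BAD! lines.
logged = [
    (2513600376, 76708),
    (2342485912, 71486),
    (3393683440, 103566),
    (3062103824, 93447),
]


def extent_of(sector):
    """Extent number a sector falls into, under the assumed extent size."""
    return sector // SECTORS_PER_EXTENT


matches = [extent_of(sector) == enr for sector, enr in logged]
extent_mib = SECTORS_PER_EXTENT * SECTOR_BYTES // (1024 * 1024)

for (sector, enr), ok in zip(logged, matches):
    print(sector, extent_of(sector), enr, ok)
print("extent size in MiB:", extent_mib)
```

So the message seems to identify the extent whose counter (`rs_left`) dropped below zero, which fits the description of a counter that was not reinitialized.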

Depending on your exact DRBD version, I could tell you various things
about this. But if you run 8.3 git it is supposed to be fixed, finally...

> It seems to have completed the synchronization successfully. Also, no "read errors" were
> reported on either host.
>
> Should I be concerned about the data integrity, now?

Nope. All good.

Cheers,

Lars

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com


lvml at 5t9

Jul 5, 2012, 10:38 AM

On 07/05/2012 07:00 PM, Lars Ellenberg wrote:
> On Wed, Jul 04, 2012 at 08:14:05PM +0200, Lutz Vieweg wrote:
>>> block drbd0: BAD! sector=2513600376s enr=76708 rs_left=-15 rs_failed=0 count=32 cstate=SyncSource
>>> block drbd0: BAD! sector=2342485912s enr=71486 rs_left=-19 rs_failed=0 count=32 cstate=SyncSource
>> Now I wonder: What does drbd0 want to tell me with those "BAD! ..." messages?
>
> It's just some reference counter that should not have gone negative,
> but did, because we forgot to update/reinitialize it at some stage.

Ah, ok.

> Depending on your exact DRBD version, I could tell you various things
> about this. But if you run 8.3 git it is supposed to be fixed, finally...

I'm running
version: 8.4.1 (api:1/proto:86-100)
GIT-hash: 91b4c048c1a0e06777b5f65d312b38d47abaea80
on both sides.

>> It seems to have completed the synchronization successfully. Also, no "read errors" were
>> reported on either host.
>>
>> Should I be concerned about the data integrity, now?
>
> Nope. All good.

Great!

Regards,

Lutz Vieweg



lars.ellenberg at linbit

Jul 9, 2012, 11:49 PM

On Thu, Jul 05, 2012 at 07:38:30PM +0200, Lutz Vieweg wrote:
> On 07/05/2012 07:00 PM, Lars Ellenberg wrote:
> >On Wed, Jul 04, 2012 at 08:14:05PM +0200, Lutz Vieweg wrote:
> >>>block drbd0: BAD! sector=2513600376s enr=76708 rs_left=-15 rs_failed=0 count=32 cstate=SyncSource
> >>>block drbd0: BAD! sector=2342485912s enr=71486 rs_left=-19 rs_failed=0 count=32 cstate=SyncSource
> >>Now I wonder: What does drbd0 want to tell me with those "BAD! ..." messages?
> >
> >It's just some reference counter that should not have gone negative,
> >but did, because we forgot to update/reinitialize it at some stage.
>
> Ah, ok.
>
> >Depending on your exact DRBD version, I could tell you various things
> >about this. But if you run 8.3 git it is supposed to be fixed, finally...
>
> I'm running
> version: 8.4.1 (api:1/proto:86-100)
> GIT-hash: 91b4c048c1a0e06777b5f65d312b38d47abaea80
> on both sides.

Well, the fix will be in 8.4.2, too,
which should have been out a few weeks ago already.
We are working on that...
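A minimal sketch for checking a node against that statement, assuming the version line has the form quoted in this thread ("version: 8.4.1 (api:1/proto:86-100)"). The helper names are hypothetical, and only the 8.4 series is judged, since the 8.3 fix is described here only as "8.3 git" without a release number.

```python
import re


def parse_drbd_version(line):
    """Extract (major, minor, patch) from a 'version: X.Y.Z ...' line."""
    match = re.search(r"version:\s*(\d+)\.(\d+)\.(\d+)", line)
    if match is None:
        raise ValueError("no DRBD version found in: %r" % line)
    return tuple(int(part) for part in match.groups())


def predates_8_4_2(version):
    """True/False for 8.4.x releases; None for any other series,
    because this thread gives no release number for them."""
    if version[:2] != (8, 4):
        return None
    return version < (8, 4, 2)


version = parse_drbd_version("version: 8.4.1 (api:1/proto:86-100)")
print(version, predates_8_4_2(version))
```

For the 8.4.1 reported above this yields `True`, meaning the node predates the release that is announced to carry the fix.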

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com