
Mailing List Archive: DRBD: Users

DRBD - one half of Proxmox cluster miscommunicating

 

 



chibi at gol

Jul 31, 2012, 9:30 PM

Post #26 of 30 (523 views)
Re: DRBD - one half of Proxmox cluster miscommunicating

On Tue, 31 Jul 2012 16:54:48 +0100 James Gibbon wrote:

> On Tue, 31 Jul 2012 17:21:14 +0200
> Felix Frank <ff [at] mpexnet> wrote:
>
> > On 07/31/2012 05:19 PM, James Gibbon wrote:
> > > Is that right?
> >
> > Yes.
> >
> > I'd still consider losing the split brain resolution option.
> >
>
> Many thanks, Felix. This has all been a useful learning experience
> as well as a problem resolution resource. I'm going to run the fix
> shortly.
>
> One last question: why does the broken node show as "UpToDate"
> in /proc/drbd? Is that because it's a primary, and not
> able to communicate with the other side, therefore it's entitled to
> consider itself current?
>
Yes, in its little universe, it is the king. ^o^

The "drbdadm connect --discard-my-data all" will take care of that false
assumption. ^.^

Regards,

Christian
--
Christian Balzer Network/Systems Engineer
chibi [at] gol Global OnLine Japan/Fusion Communications
http://www.gol.com/
_______________________________________________
drbd-user mailing list
drbd-user [at] lists
http://lists.linbit.com/mailman/listinfo/drbd-user


jg at jamesgibbon

Aug 1, 2012, 1:53 AM

Post #27 of 30 (498 views)
Re: DRBD - one half of Proxmox cluster miscommunicating

On Wed, 1 Aug 2012 13:30:57 +0900
Christian Balzer <chibi [at] gol> wrote:

> >
> Yes, in its little universe, it is the king. ^o^
>
> The "drbdadm connect --discard-my-data all" will take care of
> that false assumption. ^.^
>

Thanks Christian.

I did post an account of the fix last night to the list just to
wrap up the thread - it didn't quite go to plan unfortunately,
but everything worked out nicely in the end.

It hasn't made it through to the list but if it doesn't turn up,
I'll resend it.

James



--



james.gibbon at virgin

Aug 1, 2012, 1:54 AM

Post #28 of 30 (503 views)
Re: DRBD - one half of Proxmox cluster miscommunicating

OK. This didn't quite go to plan.

I assigned the proper IP address to the DRBD NIC on the secondary
successfully.

But:

Firstly my version of drbdadm doesn't support "--discard-my-data"

# drbdadm connect --discard-my-data all
drbdadm: unrecognized option `--discard-my-data'
try 'drbdadm help'

A bit of Googling suggested that this might help:

# drbdadm -- --discard-my-data connect all

- allegedly to pass the option straight through to drbdsetup.

But that gave an error - complaining that:

0: Failure: (123) --discard-my-data not allowed when primary.

So I then tried:

# drbdadm secondary all
0: State change failed: (-12) Device is held open by someone
Command 'drbdsetup 0 secondary' terminated with exit code 11
1: State change failed: (-12) Device is held open by someone
Command 'drbdsetup 1 secondary' terminated with exit code 11
pves2:/etc/network#
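
The "held open" error means something is still using the DRBD device, which is why deactivating the LVM volume groups on top of it helps. As a minimal sketch, assuming the kernel exposes stacked devices under /sys/block/<device>/holders, the following lists what sits on top of a given DRBD device. The function name holders_of and its optional second argument are hypothetical, added only for illustration. Mounted filesystems and processes that have the device open directly will not show up here.

```shell
# List block devices (for example device-mapper volumes created by LVM)
# stacked on top of a DRBD device, as exposed by sysfs.
# $1: device name such as drbd0
# $2: sysfs block root, defaults to /sys/block (overridable for testing)
holders_of() {
    dev=$1
    root=${2:-/sys/block}
    ls "$root/$dev/holders" 2>/dev/null
}

# Usage: holders_of drbd0
```

Any dm-N entries it prints would typically be logical volumes that need deactivating before the node can be demoted.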


Google then suggested:

# vgchange -an <volume group>

.. so I ran that on drbdvg and drbdvg1.

Then "drbdadm secondary all" worked successfully, following
which, "drbdadm connect --discard-my-data all" was also
happy to run.
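
For anyone repeating this on an 8.3 node, the order of operations above can be sketched as a small shell function. This is a sketch under assumptions, not a tested procedure: the names run, discard_and_resync and RUN are hypothetical, and the last step uses the pass-through form quoted earlier in this post, since the plain form was rejected as an unrecognized option on this version. It only prints the commands unless RUN=1 is set, because the final step throws away the local node's changes.

```shell
#!/bin/sh
# Print each command by default; execute it only when RUN=1 is set.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# $@: LVM volume groups that sit on top of the DRBD devices.
discard_and_resync() {
    # 1. Deactivate the volume groups that hold the DRBD devices open.
    for vg in "$@"; do
        run vgchange -an "$vg" || return 1
    done
    # 2. Demote this node; --discard-my-data is refused while it is Primary.
    run drbdadm secondary all || return 1
    # 3. Reconnect and discard this node's data (8.3 pass-through syntax).
    run drbdadm -- --discard-my-data connect all
}

# Usage (dry run): discard_and_resync drbdvg drbdvg1
```

Run it only on the node whose data is to be thrown away.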

I watched /proc/drbd while the mirror synced up - it displays
nice little progress bars like this:

# cat /proc/drbd
version: 8.3.7 (api:88/proto:86-91)
srcversion: EE47D8BF18AC166BE219757
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----
    ns:1253940 nr:0 dw:41622456 dr:181039187 al:519664 bm:519738 lo:18 pe:223 ua:60 ap:8 ep:1 wo:b oos:10332000
        [=>..................] sync'ed: 10.8% (10088/11300)M
        finish: 0:17:43 speed: 9,672 (12,680) K/sec
 1: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r----
    ns:1296696 nr:0 dw:4851300 dr:17169105 al:1699 bm:1946 lo:0 pe:120 ua:0 ap:8 ep:1 wo:b oos:742860
        [===========>........] sync'ed: 63.6% (742860/2034536)K
        finish: 0:00:46 speed: 16,108 (13,180) K/sec
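
If you would rather script the wait than watch the progress bars, the percentages can be pulled out of that output. A minimal sketch, assuming the 8.3-style layout shown above (a device line starting with the minor number and "cs:", followed by a "sync'ed:" line); the function name sync_progress is hypothetical.

```shell
# Print "<minor> <percent>" for every device that is currently resyncing.
# Reads /proc/drbd-style text on stdin.
sync_progress() {
    awk '
        /^[ \t]*[0-9]+: cs:/ { minor = $1; sub(/:$/, "", minor) }
        /sync.ed:/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /%$/) { pct = $i; sub(/%$/, "", pct); print minor, pct }
        }
    '
}

# Usage: sync_progress < /proc/drbd
```

Fed the output above, it prints "0 10.8" and "1 63.6"; once the resync has finished it prints nothing.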

.. and it completed successfully. Comfortingly, the VMs on the first node
continued to work properly.

Anyway .. since it was now in Primary/Secondary and the logical volumes were
unavailable, I rebooted the second box.

And finally,

# cat /proc/drbd
version: 8.3.7 (api:88/proto:86-91)
srcversion: EE47D8BF18AC166BE219757
 0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r----
    ns:11689952 nr:0 dw:41746996 dr:191403115 al:522263 bm:522764 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r----
    ns:2073056 nr:0 dw:4888628 dr:17917209 al:1706 bm:2088 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
#
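
The healthy state shown above can also be checked from a script. A minimal sketch, assuming the same 8.3-style device lines; the function name drbd_all_healthy is hypothetical. It looks at both the connection state and the disk states, because a disconnected node still reports its own disk as UpToDate, as discussed earlier in the thread.

```shell
# Succeed only if every device line reports cs:Connected and
# ds:UpToDate/UpToDate. Reads /proc/drbd-style text on stdin.
drbd_all_healthy() {
    awk '
        /^[ \t]*[0-9]+: cs:/ {
            seen++
            if ($2 != "cs:Connected" || $4 != "ds:UpToDate/UpToDate") bad++
        }
        END {
            if (seen > 0 && bad == 0) exit 0
            exit 1
        }
    '
}

# Usage: drbd_all_healthy < /proc/drbd && echo "all devices in sync"
```

It fails if no device lines are found at all, so an unloaded module is not mistaken for a healthy one.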

Deep joy.

Many thanks for the help, and I hope this thread will prove useful to
some other victim of their career choice at some point in the future.

James





chibi at gol

Aug 1, 2012, 2:03 AM

Post #29 of 30 (496 views)
Re: DRBD - one half of Proxmox cluster miscommunicating

On Wed, 1 Aug 2012 09:53:44 +0100 James Gibbon wrote:

> On Wed, 1 Aug 2012 13:30:57 +0900
> Christian Balzer <chibi [at] gol> wrote:
>
> > >
> > Yes, in its little universe, it is the king. ^o^
> >
> > The "drbdadm connect --discard-my-data all" will take care of
> > that false assumption. ^.^
> >
>
> Thanks Christian.
>
> I did post an account of the fix last night to the list just to
> wrap up the thread - it didn't quite go to plan unfortunately,
> but everything worked out nicely in the end.
>
I saw that just now. That command up there will only work with 8.4; I had
assumed from Felix's previous suggestions that this was the version you're
using. ^.^
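
Since the syntax differs between releases, a script can pick the form from the version line in /proc/drbd. A minimal sketch covering only the two series discussed in this thread: 8.3 takes the option via the "--" pass-through, 8.4 takes it after "connect". The function names drbd_version and discard_cmd_for_version are hypothetical, and the helper only prints the command rather than running it.

```shell
# Extract the module version (e.g. "8.3.7") from /proc/drbd-style text
# on stdin.
drbd_version() {
    sed -n 's/^version: \([0-9.]*\).*/\1/p'
}

# Print the discard-and-reconnect command for a given version string.
# Anything other than 8.3.x or 8.4.x is refused rather than guessed at.
discard_cmd_for_version() {
    case "$1" in
        8.3|8.3.*) echo "drbdadm -- --discard-my-data connect all" ;;
        8.4|8.4.*) echo "drbdadm connect --discard-my-data all" ;;
        *) echo "unhandled DRBD version: $1" >&2; return 1 ;;
    esac
}

# Usage: discard_cmd_for_version "$(drbd_version < /proc/drbd)"
```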

Good to hear/see it worked out in the end.

Christian
--
Christian Balzer Network/Systems Engineer
chibi [at] gol Global OnLine Japan/Fusion Communications
http://www.gol.com/


ff at mpexnet

Aug 1, 2012, 2:04 AM

Post #30 of 30 (510 views)
Re: DRBD - one half of Proxmox cluster miscommunicating

On 08/01/2012 10:54 AM, James Gibbon wrote:
> 0: Failure: (123) --discard-my-data not allowed when primary.

Right. Guess I got a little over-excited there ;)

Glad it worked out for you in the end. Good google-work. If this keeps
happening to you, I recommend booking Linbit training at some point.
It's not cheap, but it's great value!

Regards,
Felix
