Mailing List Archive: Linux-HA: Users

cman+pacemaker+dual-primary drbd does not promote

seligman at nevis

Jan 30, 2012, 2:42 PM

Post #1 of 14 (1636 views)
Permalink
cman+pacemaker+dual-primary drbd does not promote

I'm trying to follow the directions for setting up a dual-primary DRBD setup
with CMAN and Pacemaker. I'm stuck at an annoying spot: Pacemaker won't promote
the DRBD resources to primary at either node.


Here's the result of crm_mon:

Last updated: Mon Jan 30 17:07:03 2012
Stack: cman
Current DC: hypatia-tb - partition with quorum
Version: 1.1.5-5.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
2 Nodes configured, unknown expected votes
2 Resources configured.
============

Online: [ orestes-tb hypatia-tb ]

Master/Slave Set: AdminClone [AdminDrbd]
Slaves: [ hypatia-tb orestes-tb ]



/etc/cluster/cluster.conf:

<cluster config_version="6" name="Nevis_HA">
<logging debug="off"/>
<cman expected_votes="1" two_node="1" />
<clusternodes>
<clusternode name="hypatia-tb" nodeid="1">
<fence>
<method name="pcmk-redirect">
<device name="pcmk" port="hypatia-tb"/>
</method>
</fence>
</clusternode>
<clusternode name="orestes-tb" nodeid="2">
<fence>
<method name="pcmk-redirect">
<device name="pcmk" port="orestes-tb"/>
</method>
</fence>
</clusternode>
</clusternodes>
<fencedevices>
<fencedevice name="pcmk" agent="fence_pcmk"/>
</fencedevices>
<!-- <fence_daemon post_join_delay="30" /> -->
</cluster>


crm configure show:

node hypatia-tb
node orestes-tb
primitive AdminDrbd ocf:linbit:drbd \
params drbd_resource="admin" \
op monitor interval="60s" role="Master" \
op stop interval="0" timeout="320" \
op start interval="0" timeout="240"
primitive Clvmd lsb:clvmd
ms AdminClone AdminDrbd \
meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1" notify="true"
clone ClvmdClone Clvmd
colocation ClvmdWithAdmin inf: ClvmdClone AdminClone:Master
order AdminBeforeClvmd inf: AdminClone:promote ClvmdClone:start
property $id="cib-bootstrap-options" \
dc-version="1.1.5-5.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
cluster-infrastructure="cman" \
stonith-enabled="false"


DRBD looks OK:

# cat /proc/drbd
version: 8.4.0 (api:1/proto:86-100)
GIT-hash: 28753f559ab51b549d16bcf487fe625d5919c49c build by gardner@, 2012-01-25 19:10:28
0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0


I can manually do "drbdadm primary admin" on both nodes and get a
Primary/Primary state. That still does not get Pacemaker to promote the resource.


The only vaguely relevant lines in /var/log/messages seem to be:

Jan 30 17:38:13 hypatia-tb lrmd: [11260]: info: RA output (AdminDrbd:0:start:stdout)
Jan 30 17:38:13 hypatia-tb lrmd: [11260]: info: RA output: (AdminDrbd:0:start:stderr) Could not map uname=hypatia-tb.nevis.columbia.edu to a UUID: The object/attribute does not exist
Jan 30 17:38:13 hypatia-tb lrmd: [11260]: info: RA output (AdminDrbd:0:start:stdout)


I've tried running with iptables both on and off, and the results are the same.


Any clues?
--
Bill Seligman | Phone: (914) 591-2823
Nevis Labs, Columbia Univ | mailto://seligman [at] nevis
PO Box 137 |
Irvington NY 10533 USA | http://www.nevis.columbia.edu/~seligman/
Attachments: smime.p7s (4.39 KB)


emi2fast at gmail

Jan 30, 2012, 3:12 PM

Post #2 of 14 (1590 views)
Permalink
Re: cman+pacemaker+dual-primary drbd does not promote [In reply to]

Sorry, William,

but if you want to implement dual primary, I think you don't need promote for
your DRBD.
Try using a clone without master/slave.

--
This is my life, and I will live it for as long as God wills
_______________________________________________
Linux-HA mailing list
Linux-HA [at] lists
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


arnold at arnoldarts

Jan 30, 2012, 3:36 PM

Post #3 of 14 (1582 views)
Permalink
Re: cman+pacemaker+dual-primary drbd does not promote [In reply to]

On Tuesday 31 January 2012 00:12:52 emmanuel segura wrote:
> But if you wanna implement dual primary i think you don't need promote for
> your drbd
> Try to use clone without master/slave

At least when you use the linbit-ra, using it without a master-clone will give
you one(!) slave only. When you use a normal clone with two clones, you will
get two slaves. The RA only goes primary on "promote", that is, when it's in
master-state. => You need a master-clone of two clones with 1-2 masters to use
drbd in the cluster.

Have fun,

Arnold
Attachments: signature.asc (0.19 KB)


lars.ellenberg at linbit

Jan 31, 2012, 2:47 AM

Post #4 of 14 (1569 views)
Permalink
Re: cman+pacemaker+dual-primary drbd does not promote [In reply to]

On Mon, Jan 30, 2012 at 05:42:34PM -0500, William Seligman wrote:
> I'm trying to follow the directions for setting up a dual-primary DRBD setup
> with CMAN and Pacemaker. I'm stuck at an annoying spot: Pacemaker won't promote
> the DRBD resources to primary at either node.
>
>
> Here's the result of crm_mon:
>
> Last updated: Mon Jan 30 17:07:03 2012
> Stack: cman
> Current DC: hypatia-tb - partition with quorum
> Version: 1.1.5-5.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
> 2 Nodes configured, unknown expected votes
> 2 Resources configured.
> ============
>
> Online: [ orestes-tb hypatia-tb ]
>
> Master/Slave Set: AdminClone [AdminDrbd]
> Slaves: [ hypatia-tb orestes-tb ]

> crm configure show:
>
> node hypatia-tb
> node orestes-tb
> primitive AdminDrbd ocf:linbit:drbd \
> params drbd_resource="admin" \
> op monitor interval="60s" role="Master" \

You are missing an additional monitor op for role=Slave.
Make sure it has a different interval than the one for role=Master.

e.g.
op monitor interval="59s" role="Slave" \
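
Pacemaker tells recurring operations on a resource apart by name and interval, which is why the two monitor ops need different intervals. As a rough sketch, the check below scans a saved copy of the "crm configure show" output for both ops. The function names monitor_interval and check_monitor_ops are hypothetical, and the sed pattern assumes that interval comes before role on the line, as in the configuration quoted here.

```shell
# monitor_interval ROLE
# Reads crm-style configuration on stdin and prints the interval of the
# first "op monitor" line for ROLE (Master or Slave).
# Assumption: interval="..." appears before role="..." on that line.
monitor_interval() {
    sed -n "s/.*op monitor interval=\"\([^\"]*\)\" role=\"$1\".*/\1/p" | head -n 1
}

# check_monitor_ops FILE
# Succeeds only if FILE has a monitor op for each role and the two
# intervals differ.
check_monitor_ops() {
    conf=$1
    master=$(monitor_interval Master < "$conf")
    slave=$(monitor_interval Slave < "$conf")
    if [ -z "$master" ] || [ -z "$slave" ]; then
        echo "missing monitor op: Master=${master:-none} Slave=${slave:-none}"
        return 1
    fi
    if [ "$master" = "$slave" ]; then
        echo "monitor intervals must differ: both are $master"
        return 1
    fi
    echo "ok: Master=$master Slave=$slave"
}
```

One way to use it would be to save the configuration with "crm configure show > /tmp/cib.txt" and then run "check_monitor_ops /tmp/cib.txt".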

> op stop interval="0" timeout="320" \
> op start interval="0" timeout="240"
> primitive Clvmd lsb:clvmd
> ms AdminClone AdminDrbd \
> meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1"
> notify="true"
> clone ClvmdClone Clvmd
> colocation ClvmdWithAdmin inf: ClvmdClone AdminClone:Master
> order AdminBeforeClvmd inf: AdminClone:promote ClvmdClone:start
> property $id="cib-bootstrap-options" \
> dc-version="1.1.5-5.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
> cluster-infrastructure="cman" \
> stonith-enabled="false"

Also remember that, for dual-primary DRBD,
working and tested fencing both on cluster level (stonith) and on drbd
level (fence-peer) is mandatory.
Unless you don't care for data integrity.
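
For reference, DRBD-level fencing for a dual-primary resource is usually configured along the following lines. This is only a sketch in DRBD 8.4 syntax, not taken from the poster's drbd.conf. The handler paths are where the DRBD packages commonly install the scripts and may differ on your distribution, and the host and device sections of the resource are left out. On the cluster side, stonith-enabled would also have to be "true" with a working STONITH device configured.

```text
resource admin {
  net {
    # required for Primary/Primary operation
    allow-two-primaries yes;
  }
  disk {
    # freeze I/O and call the fence-peer handler when the peer is lost
    fencing resource-and-stonith;
  }
  handlers {
    # assumed install paths; check where your DRBD package puts these scripts
    fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
    after-resync-target "/usr/lib/drbd/crm-unfence-peer.sh";
  }
}
```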

>
>
> DRBD looks OK:
>
> # cat /proc/drbd
> version: 8.4.0 (api:1/proto:86-100)
> GIT-hash: 28753f559ab51b549d16bcf487fe625d5919c49c build by gardner@, 2012-01-25
> 19:10:28
> 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
> ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0


--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com


seligman at nevis

Jan 31, 2012, 9:08 AM

Post #5 of 14 (1567 views)
Permalink
Re: cman+pacemaker+dual-primary drbd does not promote [In reply to]

On Tue, 31 Jan 2012 00:36:23 Arnold Krille wrote:

> On Tuesday 31 January 2012 00:12:52 emmanuel segura wrote:

>> But if you wanna implement dual primary i think you don't need promote for
>> your drbd
>> Try to use clone without master/slave

> At least when you use the linbit-ra, using it without a master-clone will give
> you one(!) slave only. When you use a normal clone with two clones, you will
> get two slaves. The RA only goes primary on "promote", that is, when it's in
> master-state. => You need a master-clone of two clones with 1-2 masters to use
> drbd in the cluster.

If I understand Emmanuel's suggestion: The only way I know how to implement this
is to create a simple clone group with lsb:drbd instead of Linbit's drbd
resource, and put "become-primary-on" for both my nodes in drbd.conf.

This might work in the short term, but I think it's risky in the long term. For
example: Something goes wrong and node A stoniths node B. I bring node B back
up, disabling cman+pacemaker before I do so, and want to re-sync node B's DRBD
partition with A. If I'm stupid (occupational hazard), I won't remember to edit
drbd.conf before I do this, node B will automatically try to become primary, and
probably get stonith'ed again.


Arnold: I thought that was what I was doing with these statements:

primitive AdminDrbd ocf:linbit:drbd \
params drbd_resource="admin" \
op monitor interval="60s" role="Master" \
op stop interval="0" timeout="320" \
op start interval="0" timeout="240"

ms AdminClone AdminDrbd \
meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1"


That is, master-max="2" means to promote two instances to master. Did I get it
wrong?
--
Bill Seligman | Phone: (914) 591-2823
Nevis Labs, Columbia Univ | mailto://seligman [at] nevis
PO Box 137 |
Irvington NY 10533 USA | http://www.nevis.columbia.edu/~seligman/
Attachments: smime.p7s (4.39 KB)


seligman at nevis

Jan 31, 2012, 9:55 AM

Post #6 of 14 (1573 views)
Permalink
Re: cman+pacemaker+dual-primary drbd does not promote [In reply to]

On Tue Jan 31 03:47:11 MST 2012 Lars Ellenberg wrote:

> On Mon, Jan 30, 2012 at 05:42:34PM -0500, William Seligman wrote:
>> I'm trying to follow the directions for setting up a dual-primary DRBD setup
>> with CMAN and Pacemaker. I'm stuck at an annoying spot: Pacemaker won't promote
>> the DRBD resources to primary at either node.
>>
>>
>> Here's the result of crm_mon:
>>
>> Last updated: Mon Jan 30 17:07:03 2012
>> Stack: cman
>> Current DC: hypatia-tb - partition with quorum
>> Version: 1.1.5-5.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f
>> 2 Nodes configured, unknown expected votes
>> 2 Resources configured.
>> ============
>>
>> Online: [ orestes-tb hypatia-tb ]
>>
>> Master/Slave Set: AdminClone [AdminDrbd]
>> Slaves: [ hypatia-tb orestes-tb ]
>
>> crm configure show:
>>
>> node hypatia-tb
>> node orestes-tb
>> primitive AdminDrbd ocf:linbit:drbd \
>> params drbd_resource="admin" \
>> op monitor interval="60s" role="Master" \
>
> You are missing an additional monitor op for role=Slave
> make sure it has a different interval than the one for role=Master.
>
> e.g.
> op monitor interval="59s" role="Slave" \
>
>> op stop interval="0" timeout="320" \
>> op start interval="0" timeout="240"

I put that in, but it didn't change my basic problem: Neither instance of
AdminDrbd is promoted on either node.

>> primitive Clvmd lsb:clvmd
>> ms AdminClone AdminDrbd \
>> meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1"
>> notify="true"
>> clone ClvmdClone Clvmd
>> colocation ClvmdWithAdmin inf: ClvmdClone AdminClone:Master
>> order AdminBeforeClvmd inf: AdminClone:promote ClvmdClone:start
>> property $id="cib-bootstrap-options" \
>> dc-version="1.1.5-5.el6-01e86afaaa6d4a8c4836f68df80ababd6ca3902f" \
>> cluster-infrastructure="cman" \
>> stonith-enabled="false"
>
> Also remember that, for dual-primary DRBD,
> working and tested fencing both on cluster level (stonith) and on drbd
> level (fence-peer) is mandatory.
> Unless you don't care for data integrity.

I'll get to that. I'm just starting out on this configuration. I don't want to
put in STONITH just yet, otherwise I'll have to do recovery after every typo.
I'll put in STONITH and test it when I get to installing the KVM resources. But
until I solve this problem, I can't get to that stage.

>> DRBD looks OK:
>>
>> # cat /proc/drbd
>> version: 8.4.0 (api:1/proto:86-100)
>> GIT-hash: 28753f559ab51b549d16bcf487fe625d5919c49c build by gardner@, 2012-01-25
>> 19:10:28
>> 0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
>> ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

Any clues as to what I can look at to track the source of the problem?

--
Bill Seligman | Phone: (914) 591-2823
Nevis Labs, Columbia Univ | mailto://seligman [at] nevis
PO Box 137 |
Irvington NY 10533 USA | http://www.nevis.columbia.edu/~seligman/
Attachments: smime.p7s (4.39 KB)


emi2fast at gmail

Jan 31, 2012, 12:47 PM

Post #7 of 14 (1567 views)
Permalink
Re: cman+pacemaker+dual-primary drbd does not promote [In reply to]

William, try following Arnold's suggestion.

My case is different because we don't use DRBD; we are using a SAN with
OCFS2.

But I think for DRBD in dual-primary mode you need the attribute master-max="2".

--
This is my life, and I will live it for as long as God wills


seligman at nevis

Jan 31, 2012, 12:54 PM

Post #8 of 14 (1581 views)
Permalink
Re: cman+pacemaker+dual-primary drbd does not promote [In reply to]

On 1/31/12 3:47 PM, emmanuel segura wrote:

> William try to follow the suggestion of Arnold
>
> In my case it's different because we don't use drbd we are using SAN with
> ocfs2
>
> But i think for drbd in dual primary you need the attribute master-max="2"

I did, or thought I did. Have I missed something? Again, from "crm configure show":

primitive AdminDrbd ocf:linbit:drbd \
params drbd_resource="admin" \
op monitor interval="60s" role="Master" \
op monitor interval="59s" role="Slave" \
op stop interval="0" timeout="320" \
op start interval="0" timeout="240"

ms AdminClone AdminDrbd \
meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1"

Still no promotion to primary on either node.

--
Bill Seligman | Phone: (914) 591-2823
Nevis Labs, Columbia Univ | mailto://seligman [at] nevis
PO Box 137 |
Irvington NY 10533 USA | http://www.nevis.columbia.edu/~seligman/
Attachments: smime.p7s (4.39 KB)


emi2fast at gmail

Jan 31, 2012, 1:11 PM

Post #9 of 14 (1571 views)
Permalink
Re: cman+pacemaker+dual-primary drbd does not promote [In reply to]

William, can you try it like this:

primitive AdminDrbd ocf:linbit:drbd \
params drbd_resource="admin" \
op monitor interval="60s" role="Master"

clone Adming AdminDrbd

--
This is my life, and I will live it for as long as God wills


lars.ellenberg at linbit

Jan 31, 2012, 1:17 PM

Post #10 of 14 (1568 views)
Permalink
Re: cman+pacemaker+dual-primary drbd does not promote [In reply to]

On Tue, Jan 31, 2012 at 10:11:23PM +0100, emmanuel segura wrote:
> William can you try like this
>
> primitive AdminDrbd ocf:linbit:drbd \
> params drbd_resource="admin" \
> op monitor interval="60s" role="Master"
>
> clone Adming AdminDrbd

Nope.

--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com


seligman at nevis

Jan 31, 2012, 1:26 PM

Post #11 of 14 (1594 views)
Permalink
Re: cman+pacemaker+dual-primary drbd does not promote [In reply to]

On 1/31/12 4:11 PM, emmanuel segura wrote:
> William can you try like this
>
> primitive AdminDrbd ocf:linbit:drbd \
> params drbd_resource="admin" \
> op monitor interval="60s" role="Master"
>
> clone Adming AdminDrbd

Both Arnold and Lars said this wouldn't work. I just tried it. They were right.

Is there anything at all to the log message:

Jan 31 16:20:54 orestes-tb lrmd: [12231]: info: RA output: (AdminDrbd:1:monitor:stderr) Could not map uname=orestes-tb.nevis.columbia.edu to a UUID: The object/attribute does not exist

That's been syslog'ed every 59 seconds since I updated AdminDrbd as Lars suggested:

primitive AdminDrbd ocf:linbit:drbd \
params drbd_resource="admin" \
op monitor interval="60s" role="Master" \
op monitor interval="59s" role="Slave" \
op stop interval="0" timeout="320" \
op start interval="0" timeout="240"


--
Bill Seligman | Phone: (914) 591-2823
Nevis Labs, Columbia Univ | mailto://seligman [at] nevis
PO Box 137 |
Irvington NY 10533 USA | http://www.nevis.columbia.edu/~seligman/
Attachments: smime.p7s (4.39 KB)


lars.ellenberg at linbit

Jan 31, 2012, 1:42 PM

Post #12 of 14 (1578 views)
Permalink
Re: cman+pacemaker+dual-primary drbd does not promote [In reply to]

On Tue, Jan 31, 2012 at 04:26:44PM -0500, William Seligman wrote:
> On 1/31/12 4:11 PM, emmanuel segura wrote:
> > William can you try like this
> >
> > primitive AdminDrbd ocf:linbit:drbd \
> > params drbd_resource="admin" \
> > op monitor interval="60s" role="Master"
> >
> > clone Adming AdminDrbd
>
> Both Arnold and Lars said this wouldn't work. I just tried it. They were right.
>
> Is there anything at all to the log message:
>
> Jan 31 16:20:54 orestes-tb lrmd: [12231]: info: RA output:
> (AdminDrbd:1:monitor:stderr) Could not map uname=orestes-tb.nevis.columbia.edu
> to a UUID: The object/attribute does not exist

Hmmm.
That message comes from cib_utils.c,
probably via crm_master, which is a wrapper
around crm_attribute.
This "should not happen".

Looks like parts of the system do not agree on whether to use
orestes-tb only, or orestes-tb.nevis.columbia.edu ...

And if the resource agent is unable to set a master score,
pacemaker will not even try to promote.

What does uname -n say?
Does it list the node name only, or the FQDN?

Does not really look like something DRBD could fix.

Try to get a "ocf:pacemaker:Stateful" dummy resource promoted,
if that works, come back with drbd specifics.
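
The suspected mismatch can be checked by comparing the output of "uname -n" with the clusternode names in cluster.conf. A minimal sketch follows; check_node_name is a hypothetical helper that does a plain text match rather than parsing the XML, and the cluster.conf path is the one from the original post.

```shell
# check_node_name NAME CONF
# Succeeds if NAME appears verbatim as a clusternode name in CONF
# (a cman cluster.conf). Hypothetical helper for illustration only:
# it uses a fixed-string match, not a real XML parser.
check_node_name() {
    name=$1
    conf=$2
    if grep -qF -- "<clusternode name=\"$name\"" "$conf"; then
        echo "ok: $name is a clusternode in $conf"
    else
        echo "mismatch: $name is not a clusternode in $conf"
        return 1
    fi
}
```

On a cluster node it would be called as: check_node_name "$(uname -n)" /etc/cluster/cluster.conf. Because the match includes the closing quote, a short name such as hypatia-tb does not match an entry that holds the FQDN, and vice versa.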


--
: Lars Ellenberg
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com


seligman at nevis

Jan 31, 2012, 2:04 PM

Post #13 of 14 (1614 views)
Permalink
Re: cman+pacemaker+dual-primary drbd does not promote [In reply to]

On 1/31/12 4:42 PM, Lars Ellenberg wrote:
> On Tue, Jan 31, 2012 at 04:26:44PM -0500, William Seligman wrote:
>> On 1/31/12 4:11 PM, emmanuel segura wrote:
>>> William can you try like this
>>>
>>> primitive AdminDrbd ocf:linbit:drbd \
>>> params drbd_resource="admin" \
>>> op monitor interval="60s" role="Master"
>>>
>>> clone Adming AdminDrbd
>>
>> Both Arnold and Lars said this wouldn't work. I just tried it. They were right.
>>
>> Is there anything at all to the log message:
>>
>> Jan 31 16:20:54 orestes-tb lrmd: [12231]: info: RA output:
>> (AdminDrbd:1:monitor:stderr) Could not map uname=orestes-tb.nevis.columbia.edu
>> to a UUID: The object/attribute does not exist
>
> Hmmm.
> That message comes from cib_utils.c.
> probably crm_master, which is a wrapper
> around crm_attribute.
> "should not happen".
>
> Looks like parts of the system do not agree on whether to use
> orestes-tb only, or orestes-tb.nevis.columbia.edu ...
>
> And if the resource agent is unable to set a master score,
> pacemaker will not even try to promote.
>
> What does uname -n say?
> Does it list the node name only, or the FQDN?

# uname -n
orestes-tb.nevis.columbia.edu

Aha! I went to /etc/cluster/cluster.conf, and changed all the host names to the
FQDN. It works!


Master/Slave Set: AdminClone [AdminDrbd]
Masters: [ hypatia-tb.nevis.columbia.edu orestes-tb.nevis.columbia.edu ]


Lars is the man! And I am a fool for not reading this web page closely enough:

<http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch08s02s02.html>

In the example, they just use the node name. But it clearly states to use the
output from 'uname -n' in cluster.conf. I guess on their Linux distro uname -n
returns just the node name.

Thanks!
--
Bill Seligman | Phone: (914) 591-2823
Nevis Labs, Columbia Univ | mailto://seligman [at] nevis
PO Box 137 |
Irvington NY 10533 USA | http://www.nevis.columbia.edu/~seligman/
Attachments: smime.p7s (4.39 KB)


arnold at arnoldarts

Jan 31, 2012, 3:01 PM

Post #14 of 14 (1564 views)
Permalink
Re: cman+pacemaker+dual-primary drbd does not promote [In reply to]

On Tuesday 31 January 2012 15:54:30 William Seligman wrote:
> I did, or thought I did. Have I missed something? Again, from "crm configure
> show":
>
> primitive AdminDrbd ocf:linbit:drbd \
> params drbd_resource="admin" \
> op monitor interval="60s" role="Master" \
> op monitor interval="59s" role="Slave" \
> op stop interval="0" timeout="320" \
> op start interval="0" timeout="240"
>
> ms AdminClone AdminDrbd \
> meta master-max="2" master-node-max="1" clone-max="2" clone-node-max="1"
>
> Still no promotion to primary on either node.

Maybe add a target-role="Master" to the meta-data of the ms-clone. Or have a
resource depend on a master and make that target-role="Started".

Otherwise I think there is no incentive for the resource to go master. "crm
configure ptest [nograph] scores" helps in that regard. Also, when you change
dependencies of the resources, it's a good tool to see the effects on the
running cluster before you hit "commit".

Have fun,

Arnold
Attachments: signature.asc (0.19 KB)
