
Mailing List Archive: Linux-HA: Dev

STONITH troubles w/ IPMI on 2.99

 

 



zack at gilburd

Sep 5, 2008, 4:12 PM

Post #1 of 14
STONITH troubles w/ IPMI on 2.99

All,

This is probably more of a -users question, but since I am talking
about the devel branch (latest 2.99), I figured it might belong here.
If not, please let me know.

I have configured my two-node cluster to use STONITH via IPMI over
LAN. From what the logs tell me after a kill -9, HA recognizes that
it should use IPMI and attempts to do so, only to hit an API failure.

Here is what I can tell from the HA log:

stonithd[5196]: 2008/09/05_16:02:03 info: client tengine [pid: 25632] requests a STONITH operation RESET on node ctrl2
stonithd[5196]: 2008/09/05_16:02:03 info: we can't manage ctrl2, broadcast request to other nodes
stonithd[5196]: 2008/09/05_16:02:03 ERROR: send_ha_message: Not connected to Heartbeat
stonithd[5196]: 2008/09/05_16:02:03 WARN: crm_log_message_adv: #========= HA[outbound] message start ==========#
stonithd[5196]: 2008/09/05_16:02:03 WARN: MSG: Dumping message with 6 fields
stonithd[5196]: 2008/09/05_16:02:03 WARN: MSG[0] : [node=ctrl2]
stonithd[5196]: 2008/09/05_16:02:03 WARN: MSG[1] : [optype=1]
stonithd[5196]: 2008/09/05_16:02:03 WARN: MSG[2] : [timeout=30000]
stonithd[5196]: 2008/09/05_16:02:03 WARN: MSG[3] : [callid=-10]
stonithd[5196]: 2008/09/05_16:02:03 WARN: MSG[4] : [t=stit]
stonithd[5196]: 2008/09/05_16:02:03 WARN: MSG[5] : [src=ctrl1]
stonithd[5196]: 2008/09/05_16:02:03 ERROR: require_others_to_stonith failed.
tengine[25632]: 2008/09/05_16:02:03 ERROR: stonithd_node_fence:567: Stonithd's synchronous answer is ST_APIFAIL
tengine[25632]: 2008/09/05_16:02:03 ERROR: te_fence_node: Cannot fence ctrl2: stonithd_node_fence() call failed

Here's what I've got in my CIB:

<resources>
  <group id="stonith_group">
    <meta_attributes id="stonith_group_meta_attrs">
      <attributes>
        <nvpair id="stonith_group_metaattr_target_role"
                name="target_role" value="stopped"/>
      </attributes>
    </meta_attributes>
    <primitive id="ctrl1-fence" class="stonith" type="external/ipmi"
               provider="heartbeat">
      <instance_attributes id="ctrl1-fence_instance_attrs">
        <attributes>
          <nvpair id="f4511eb5-0c0c-4178-ae9c-d8e44e6a55db"
                  name="hostname" value="ctrl1"/>
          <nvpair id="858efbcd-744e-430d-89c7-a0cbfb9a588f"
                  name="ipaddr" value="192.168.1.200"/>
          <nvpair id="ccd75a7b-86d1-4745-ac74-2f004088e46a"
                  name="userid" value="ME"/>
          <nvpair id="d1ff940b-7a7a-455d-818f-804fcc2ff3c1"
                  name="passwd" value="BLAHBLAH"/>
          <nvpair id="882b7923-3fe1-4224-bf55-0102881994c9"
                  name="interface" value="lan"/>
        </attributes>
      </instance_attributes>
      <meta_attributes id="ctrl1-fence_meta_attrs">
        <attributes>
          <nvpair id="ctrl1-fence_metaattr_target_role"
                  name="target_role" value="started"/>
        </attributes>
      </meta_attributes>
    </primitive>
    <primitive id="ctrl2-fence" class="stonith" type="external/ipmi"
               provider="heartbeat">
      <instance_attributes id="ctrl2-fence_instance_attrs">
        <attributes>
          <nvpair id="b9b77246-562e-4a3b-b411-b54d7e35e539"
                  name="hostname" value="ctrl2"/>
          <nvpair id="57c06cfe-fc5c-4127-906a-a1af595624c7"
                  name="ipaddr" value="192.168.1.201"/>
          <nvpair id="1dfead45-7e1d-404f-b580-a2fd2f1b37fc"
                  name="userid" value="ME"/>
          <nvpair id="976815ca-7819-4fa7-a621-df3765828b25"
                  name="passwd" value="BLAHBLAH"/>
          <nvpair id="a8e8c35a-638b-48f3-9d11-769ddfecaaa5"
                  name="interface" value="lan"/>
        </attributes>
      </instance_attributes>
      <meta_attributes id="ctrl2-fence_meta_attrs">
        <attributes>
          <nvpair id="ctrl2-fence_metaattr_target_role"
                  name="target_role" value="started"/>
        </attributes>
      </meta_attributes>
    </primitive>
  </group>
</resources>
<constraints>
  <rsc_location id="ctrl1-fence-placement" rsc="ctrl1-fence">
    <rule id="prefered_ctrl1-fence-placement" score="-INFINITY">
      <expression attribute="#uname"
                  id="47962236-99d0-4942-8197-5e81f8c6604d" operation="ne" value="ctrl1"/>
    </rule>
  </rsc_location>
  <rsc_location id="ctrl2-fence-placement" rsc="ctrl2-fence">
    <rule id="prefered_ctrl2-fence-placement" score="-INFINITY">
      <expression attribute="#uname"
                  id="37f7f6e3-6ecd-440e-83b0-375f9e5bfc60" operation="ne" value="ctrl2"/>
    </rule>
  </rsc_location>
</constraints>

Obviously I did significant clipping from my cib.

I am interested to hear everyone's thoughts.

Thanks :)

Zack
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev [at] lists
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


lmb at suse

Sep 8, 2008, 4:29 AM

Post #2 of 14
Re: STONITH troubles w/ IPMI on 2.99

On 2008-09-05T16:12:45, Zack Gilburd <zack [at] gilburd> wrote:

> All,
>
> I figure this is probably more of a -users question, but I figured since I
> am talking about the devel branch (latest 2.99) it might belong here. If
> not, please let me know.

stonithd is not part of 2.99.x - what are you running?


Regards,
Lars

--
Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde



zack at gilburd

Sep 8, 2008, 7:59 AM

Post #3 of 14
Re: STONITH troubles w/ IPMI on 2.99

Lars Marowsky-Bree wrote:
> stonithd is not part of 2.99.x - what are you running?

Lars,

That could be a result of installing 2.1.3, removing it, and then
installing 2.99.x on top of that.

root@ctrl:~# dpkg -l | grep stonith
ii  libstonith0  2.99.0-1  Interface for remotely powering down a node
ii  stonith      2.99.0-1  Interface for remotely powering down a node

Is this what you meant? If not, I'm not sure I follow you as to what
the problem is.

Thanks!

Zack


dejanmm at fastmail

Sep 8, 2008, 9:58 AM

Post #4 of 14
Re: STONITH troubles w/ IPMI on 2.99

Hi,

On Mon, Sep 08, 2008 at 07:59:50AM -0700, Zack Gilburd wrote:
> Lars,
>
> That could be a result of installing 2.1.3, removing it, and then
> installing 2.99.x on top of that.
> root@ctrl:~# dpkg -l | grep stonith
> ii  libstonith0  2.99.0-1  Interface for remotely powering down a node
> ii  stonith      2.99.0-1  Interface for remotely powering down a node
>
> Is this what you meant? If not, I'm not sure I follow you as to what the
> problem is.

stonith is part of 2.99, but stonithd (which logged those
messages) is not. You need to install the latest stable
pacemaker.

Thanks,

Dejan



zack at gilburd

Sep 9, 2008, 12:46 AM

Post #5 of 14
Re: STONITH troubles w/ IPMI on 2.99

Dejan Muhamedagic wrote:
> stonith is part of 2.99, but stonithd (which logged those
> messages) is not. You need to install the latest stable
> pacemaker.

I have the latest (0.6.6-1) pacemaker installed. From what I can tell,
the plugins that I am using in those rules are from the heartbeat-2.99
Debian package.

root@ctrl:~# dlocate stonith
heartbeat: /usr/lib/stonith/plugins/external/ipmi
heartbeat: /usr/lib/stonith/plugins/stonith2/ipmilan.so

I have tried using both ipmilan and external/ipmi without any luck.
With either, I still get the stonithd messages. I am wondering if this
is somehow due to the fact that I initially set this cluster up without
pacemaker. I am not quite sure how to proceed.

Cheers,

Zack


lmb at suse

Sep 9, 2008, 3:19 PM

Post #6 of 14
Re: STONITH troubles w/ IPMI on 2.99

On 2008-09-09T00:46:27, Zack Gilburd <zack [at] gilburd> wrote:

> I have the latest (0.6.6-1) pacemaker installed. From what I can tell, the
> plugins that I am using in those rules are from the heartbeat-2.99 package
> debian package.

Right. The point was that your reference to "heartbeat 2.99.x" simply
was incomplete, and you need to specify the full set of versions.

> I have tried using both ipmilan and external/ipmi without any luck. With
> either, I still get the stonithd messages. I am wondering if this is
> somehow due to the fact that I initially set this cluster up without
> pacemaker. I am not quite sure how to proceed.

That is likely unrelated.

Please first check whether your configuration works with "stonith" on
the command line.
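
A minimal sketch of such a command-line check, assuming the name=value
parameter form of the stonith CLI and reusing the placeholder values (ME,
BLAHBLAH, 192.168.1.201) from the CIB in the first post. The helper
stonith_cmd is hypothetical and only prints the command lines, so nothing
is fenced by accident. Verify the flags against "stonith -h" on your build
before running anything.

```shell
#!/bin/sh
# Sketch only: prints a stonith command line instead of running it.
# Parameter names mirror the external/ipmi nvpairs in the posted CIB;
# the credentials are the poster's placeholders, not real values.
stonith_cmd() {
    action=$1   # "status" or "reset"
    node=$2     # cluster node name, e.g. ctrl2
    bmc=$3      # address of that node's BMC
    base="stonith -t external/ipmi hostname=$node ipaddr=$bmc userid=ME passwd=BLAHBLAH interface=lan"
    case $action in
        status) printf '%s -S\n' "$base" ;;
        reset)  printf '%s -T reset %s\n' "$base" "$node" ;;
        *)      echo "usage: stonith_cmd status|reset NODE BMC_IP" >&2
                return 2 ;;
    esac
}

stonith_cmd status ctrl2 192.168.1.201
stonith_cmd reset  ctrl2 192.168.1.201
```

Run the printed status line first. Only if the device reports OK is it
worth trying the reset line, since that one really power cycles the node.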


Regards,
Lars

--
Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde



dejanmm at fastmail

Sep 9, 2008, 4:00 PM

Post #7 of 14
Re: STONITH troubles w/ IPMI on 2.99

Hi,

On Tue, Sep 09, 2008 at 12:46:27AM -0700, Zack Gilburd wrote:
> I have the latest (0.6.6-1) pacemaker installed. From what I can tell, the
> plugins that I am using in those rules are from the heartbeat-2.99 package
> debian package.
> root@ctrl:~# dlocate stonith
> heartbeat: /usr/lib/stonith/plugins/external/ipmi
> heartbeat: /usr/lib/stonith/plugins/stonith2/ipmilan.so

Right. For whatever reason, the stonith and libstonith Debian
packages are empty.

> I have tried using both ipmilan and external/ipmi without any luck. With
> either, I still get the stonithd messages. I am wondering if this is
> somehow due to the fact that I initially set this cluster up without
> pacemaker. I am not quite sure how to proceed.

I really can't say. I tried the very same combination of packages
and stonith worked. Not with ipmi, but according to the error
message you posted, the problem seems to be with stonithd
connecting to heartbeat. Perhaps you could check the permissions on
/var/lib/heartbeat and /var/run/heartbeat. You can also attach
strace to stonithd to see exactly what it is trying to do.
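
A rough sketch of those two checks, assuming GNU stat and that strace and
pidof are installed. The helper name report_dir is hypothetical. The script
only reports what it finds; which owner and mode are correct depends on
the packaging, so compare the output with a node where stonith works.

```shell
#!/bin/sh
# Sketch only: print owner, group, and octal mode of a directory,
# or note that it does not exist. Uses GNU stat's -c format option.
report_dir() {
    if [ -d "$1" ]; then
        stat -c '%U:%G %a %n' "$1"
    else
        printf 'missing: %s\n' "$1"
    fi
}

report_dir /var/lib/heartbeat
report_dir /var/run/heartbeat

# To watch what stonithd is doing, attach strace to the running daemon
# (as root), following forked children and writing to a trace file:
#   strace -f -o /tmp/stonithd.trace -p "$(pidof stonithd)"
```

In the trace, failed connect() or open() calls on paths under the two
directories above would be the first thing to look for.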

Thanks,

Dejan



zack at gilburd

Sep 9, 2008, 5:18 PM

Post #8 of 14
Re: STONITH troubles w/ IPMI on 2.99

Gentlemen,

Thank you for your replies.

I have learned that "stonith" does indeed work from the command line using
ipmitool and the parameters that I have defined in the CIB.

I have turned up debugging to level 2 and these are the error messages I
am receiving.

stonithd[5467]: 2008/09/09_17:12:27 ERROR: Failed to STONITH the node ctrl2: optype=RESET, op_result=TIMEOUT
tengine[6157]: 2008/09/09_17:12:27 info: tengine_stonith_callback: call=-6, optype=1, node_name=ctrl2, result=2, node_list=, action=48:4:0:7da22a2f-311e-42ec-9f25-c7149dea2e1f
tengine[6157]: 2008/09/09_17:12:27 ERROR: tengine_stonith_callback: Stonith of ctrl2 failed (2)... aborting transition.

I am thinking that it could be either:
1) stonithd doesn't "see" the stonith configuration in the cib
2) stonithd can't connect to heartbeat (in which case, how would stonithd
know that the node is unexpectedly down?)

I will continue to fiddle around until I hear back from those more
knowledgeable than myself.

Thanks again,

Zack



jli at greshamstorage

Sep 11, 2008, 8:46 AM

Post #9 of 14
Re: STONITH troubles w/ IPMI on 2.99

Zack Gilburd wrote:
> I am thinking that it could be either:
> 1) stonithd doesn't "see" the stonith configuration in the cib

>
> I will continue to fiddle around until I hear back from those more
> knowledgeable than myself.

One of my co-workers says he was having problems with multiple stonith entries in his CIB (with a different
heartbeat/stonith version). I haven't personally investigated, but you might try a single stonith entry and see if
that helps. In his case, he switched to a single stonith entry that supports all the hosts in the cluster.

If that turns out to be the problem, I have a hacked version of the ipmi script which supports multiple hosts from a
single stonith entry.

This brings up a question of mine: if you have a stonith plug-in which supports multiple hosts, is it better to use it
in a multiple-host configuration, or to have a separate CIB entry for each host? Does it matter?




zack at gilburd

Sep 12, 2008, 8:33 AM

Post #10 of 14
Re: STONITH troubles w/ IPMI on 2.99

Jeremy,

Are you suggesting that I put the entries in separate groups? I don't
see how I could use one entry to stonith either node selectively
considering that their BMC boards listen on separate IPs. I can't think
of a good way to differentiate the hosts in that case.

I would be interested in seeing your script, though. I am open to
trying anything, at this point. Because this cluster in particular is
serving IETD (and ietd hates to be stopped while in use), STONITH is a must.

One odd thing worth noting is that I have no trouble fencing the node
from the command line using the stonith command. The resource, however,
does not do it at all. I am about to try SSH stonith to see if that
works as a proof of concept. I do not want to use SSH in production,
though, for various reasons, most notably that external/ipmi would
allow me to control the power via the board management controller and
not some userspace daemon.

Cheers,

Zack



jli at greshamstorage

Sep 12, 2008, 10:08 AM

Post #11 of 14
Re: STONITH troubles w/ IPMI on 2.99

Zack Gilburd wrote:
> Are you suggesting that I put the entries in separate groups? I don't
No, just one entry. Use the same username/password on all the IPMI interfaces, and then use a single stonith stanza
with the hostnames and ipaddrs listed like this:

<nvpair id="f4511eb5-0c0c-4178-ae9c-d8e44e6a55db" name="hostname" value="ctrl1,ctrl2,ctrl3"/>
<nvpair id="858efbcd-744e-430d-89c7-a0cbfb9a588f" name="ipaddr" value="192.168.1.200,192.168.1.201,192.168.1.202"/>

Make sure the host names and IP addresses are in the same order (i.e., 192.168.1.201 is for ctrl2).



> considering that their BMC boards listen on separate IPs. I can't think
> of a good way to differentiate the hosts in that case.
Stonith passes the hostname of the device to be managed to the script. It's up to the script to select the correct one
to actually power cycle. The "dumber" devices like the ipmi script need to have the hostnames passed in for comparison
purposes, hence the hostname list. I guess in theory you can get rid of both the hostname and ipaddr parameters if
you're willing to provide DNS entries of the form hostname-IPMI or similar (this is what the riloe script does). That
way, when stonith says reset ctrl1, the script knows to talk to ctrl1-ipmi.
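
The selection step described above can be sketched roughly as follows. This is an illustration only and is not taken
from the attached ipmi script: the function names lookup_bmc_ip and bmc_dns_name are hypothetical, and the "-ipmi"
suffix is just the example naming convention mentioned above.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from the attached ipmi script.

# Print the BMC address paired with node $1, matching the comma-separated
# hostname list $2 against the address list $3 by position.
lookup_bmc_ip() {
    target=$1
    hosts=$2
    addrs=$3
    old_ifs=$IFS
    IFS=,
    set -- $addrs              # positional parameters now hold the addresses
    for host in $hosts; do
        [ $# -gt 0 ] || break  # more hostnames than addresses
        if [ "$host" = "$target" ]; then
            IFS=$old_ifs
            printf '%s\n' "$1"
            return 0
        fi
        shift                  # keep the address list in step with the hosts
    done
    IFS=$old_ifs
    return 1                   # node not in the list
}

# DNS-convention alternative: derive the BMC name from the node name.
bmc_dns_name() {
    printf '%s-ipmi\n' "$1"
}

lookup_bmc_ip ctrl2 "ctrl1,ctrl2,ctrl3" "192.168.1.200,192.168.1.201,192.168.1.202"
bmc_dns_name ctrl1
```

The positional lookup is why the two lists must be kept in the same order: a swapped pair would silently power cycle
the wrong machine. The DNS variant avoids that at the cost of maintaining one extra name per node.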


> I would be interested in seeing your script, though. I am open to
> trying anything, at this point. Because this cluster in particular is
I attached it. You must use it for the syntax I pointed out above to work.
Attachments: ipmi (4.52 KB)


zack at gilburd

Sep 12, 2008, 1:51 PM

Post #12 of 14
Re: STONITH troubles w/ IPMI on 2.99

Jeremy,

Excellent, thank you.

So I gather then that you placed no constraints on this resource, correct?

Cheers,
Z



zack at gilburd

Sep 12, 2008, 7:14 PM

Post #13 of 14
Re: STONITH troubles w/ IPMI on 2.99

Oddly enough, I am still getting timeouts every time as per my original
message in the thread. I can successfully stonith the node from the
command line using the script you gave me, however.

Perhaps this is a bug after all.



dejanmm at fastmail

Sep 19, 2008, 8:00 AM

Post #14 of 14
Re: STONITH troubles w/ IPMI on 2.99

Hi,

On Fri, Sep 12, 2008 at 07:14:45PM -0700, Zack Gilburd wrote:
> Oddly enough, I am still getting timeouts every time as per my original
> message in the thread. I can successfully stonith the node from the
> command line using the script you gave me, however.

Normally, that really shouldn't happen. If the stonith
configuration is the same as what you give on the command line,
then stonithd should carry out the fencing operations in the same
timeframe. You can try to add "debug 1" to the configuration and
post logs somewhere and I can take a look at them.

> Perhaps this is a bug afterall.

Of course, there's always that possibility. If you believe that it's
a bug, please open a Bugzilla entry with all relevant information,
preferably using hb_report to generate a report.

Thanks,

Dejan
