Mailing List Archive: Linux-HA: Users

How to start failover, when RAS device is stopping

 

 



satoyoshi at intellilink

Nov 5, 2009, 12:15 AM

Post #1 of 7 (1093 views)
How to start failover, when RAS device is stopping

Hi all,

I have a two-node cluster in active/passive mode with STONITH.
STONITH plugin: ibmrsa-telnet

When the power supply of the active node (together with its RSA device) is turned off, STONITH does not succeed.
So failover does not happen (the resources do not start).
However, I would like failover to start when the RSA device is down.
For example, the SSH plugin avoids this problem with the "livedangerously" parameter.

How can I implement it?
If you come up with a good idea, let me know.

Thanks in advance for any help.

Regards,
Yoshihiko SATO


_______________________________________________
Linux-HA mailing list
Linux-HA [at] lists
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


dejanmm at fastmail

Nov 5, 2009, 1:47 AM

Post #2 of 7 (1043 views)
Re: How to start failover, when RAS device is stopping [In reply to]

Hi Yoshihiko-san,

On Thu, Nov 05, 2009 at 05:15:56PM +0900, Yoshihiko SATO wrote:
> Hi all,
>
> I have a two-node cluster in active/passive mode with STONITH.
> STONITH plugin: ibmrsa-telnet
>
> When the power supply of the active node (together with its RSA
> device) is turned off, STONITH does not succeed. So failover
> does not happen (the resources do not start). However, I would
> like failover to start when the RSA device is down. For example,
> the SSH plugin avoids this problem with the "livedangerously"
> parameter.

I thought that that name would scare anybody who cares about
their data :)

> How can I implement it?

We don't have such a thing. Somebody posted a while ago a link to
SGI's failsafe sources where they implemented a rather elaborate
scheme with which it was possible to use some heuristics to
figure out if the node is without power. It may work and it may
be a worthy addition to our fencing solution (providing that it's
turned off by default). Though I personally wouldn't trust such
witchcraft ;-)

Thanks,

Dejan



andrew at beekhof

Nov 5, 2009, 2:14 AM

Post #3 of 7 (1040 views)
Re: How to start failover, when RAS device is stopping [In reply to]

On Thu, Nov 5, 2009 at 10:47 AM, Dejan Muhamedagic <dejanmm [at] fastmail> wrote:
> We don't have such a thing. Somebody posted a while ago a link to
> SGI's failsafe sources where they implemented a rather elaborate
> scheme with which it was possible to use some heuristics to
> figure out if the node is without power. It may work and it may
> be a worthy addition to our fencing solution (providing that it's
> turned off by default). Though I personally wouldn't trust such
> witchcraft ;-)

Agreed. I would never use it personally, but I've no objection to it
being added if a) someone else writes the patch and b) it's disabled by
default :-)


satoyoshi at intellilink

Nov 6, 2009, 1:18 AM

Post #4 of 7 (1027 views)
Re: How to start failover, when RAS device is stopping [In reply to]

Hi,

Thanks for the reply.

How about adding a ping check to ibmrsa-telnet?
Of course, it would be disabled by default,
the same as in the SSH plugin.

Is that no good?
I understand that it is not sufficient, though...

I would like to hear any opinions.

Regards,
Yoshihiko SATO
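
The proposal above can be made concrete with a minimal sketch. It assumes the check would behave like this: if the RSA device cannot confirm the power-off and an opt-in flag is set, a host that does not answer ping is assumed to be off. The names `fence_node`, `power_off_via_rsa`, `host_answers_ping` and `live_dangerously` are hypothetical and are not part of ibmrsa-telnet or any existing plugin; the two callables stand in for the real RSA and ping operations.

```python
from typing import Callable


def fence_node(
    power_off_via_rsa: Callable[[], bool],
    host_answers_ping: Callable[[], bool],
    live_dangerously: bool = False,
) -> bool:
    """Return True if the node may be treated as fenced.

    Hypothetical sketch only: both callables are placeholders for the
    real operations and return True on success / on a ping reply.
    """
    # Normal path: the management device confirmed the power-off.
    if power_off_via_rsa():
        return True
    # Opt-in path (off by default): an unanswered ping is taken as
    # evidence that the node is down. This is a guess, not a confirmation.
    if live_dangerously and not host_answers_ping():
        return True
    # Otherwise fencing failed and failover must not proceed.
    return False
```

With the flag left at its default, a silent RSA device still blocks failover; only `live_dangerously=True` lets an unanswered ping count as a successful fence.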



andrew at beekhof

Nov 6, 2009, 4:02 AM

Post #5 of 7 (1024 views)
Re: How to start failover, when RAS device is stopping [In reply to]

On Thu, Nov 5, 2009 at 10:47 AM, Dejan Muhamedagic <dejanmm [at] fastmail> wrote:
> We don't have such a thing. Somebody posted a while ago a link to
> SGI's failsafe sources where they implemented a rather elaborate
> scheme with which it was possible to use some heuristics to
> figure out if the node is without power. It may work and it may
> be a worthy addition to our fencing solution (providing that it's
> turned off by default). Though I personally wouldn't trust such
> witchcraft ;-)

The relevant failsafe code, in case anyone wants to port it, is
http://oss.sgi.com/cgi-bin/cvsweb.cgi/failsafe/FailSafe/cluster_services/cmd/crs/crsd_misc.c?annotate=1.1
(read from line 139)


dejanmm at fastmail

Nov 6, 2009, 5:08 AM

Post #6 of 7 (1010 views)
Re: How to start failover, when RAS device is stopping [In reply to]

Hi,

On Fri, Nov 06, 2009 at 06:18:50PM +0900, Yoshihiko SATO wrote:
> Hi,
>
> Thanks for the reply.
>
> How about adding a ping check to ibmrsa-telnet?

No, sorry, results of ping are not considered to reliably
determine the state.

> Of course, it would be disabled by default,
> the same as in the SSH plugin.

Yes, it's just that ssh is used only for testing and the ping
part got introduced to make testing faster.

> Is that no good?

Basically, no, except for testing. Even in the ssh plugin it's not
enabled by default.

Thanks,

Dejan
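
The objection to ping can be illustrated by enumerating the cases. This is only a sketch under a simplified, assumed model in which a node answers ping exactly when it is powered and its network path works; `NodeState` and the function names are hypothetical. It lists the states in which a "no ping reply means powered off" rule would be wrong.

```python
from itertools import product
from typing import List, NamedTuple


class NodeState(NamedTuple):
    powered: bool      # the node actually has power and is running
    network_ok: bool   # the network path to the node works


def answers_ping(state: NodeState) -> bool:
    # Assumed model: a reply needs both power and a working network path.
    return state.powered and state.network_ok


def ping_says_dead(state: NodeState) -> bool:
    # The heuristic under discussion: no reply is read as "node is off".
    return not answers_ping(state)


def misjudged_states() -> List[NodeState]:
    """States the heuristic declares dead although the node is powered."""
    all_states = (NodeState(p, n) for p, n in product([True, False], repeat=2))
    return [s for s in all_states if ping_says_dead(s) and s.powered]


if __name__ == "__main__":
    for state in misjudged_states():
        print(state)
```

The one misjudged state is a powered node whose network path is down, which is the case where starting the resources on the other node could leave them running on both nodes at once.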



satoyoshi at intellilink

Nov 8, 2009, 11:19 PM

Post #7 of 7 (986 views)
Re: How to start failover, when RAS device is stopping [In reply to]

Hi, Dejan and Andrew

Thank you for your opinions.
I will take them into consideration.

Thanks,

Yoshihiko SATO

