Mailing List Archive: Linux-HA: Dev

About movement of the Quorum control.

 

 


renayama19661014 at ybb

Aug 24, 2010, 6:20 PM

Post #1 of 10 (742 views)
About movement of the Quorum control.

Hi Developers of Heartbeat,

When Pacemaker is combined with Heartbeat, we find that quorum control does not work well.

For example, the problem occurs in a cluster of several nodes when no-quorum-policy is set to
anything other than ignore.

We understand that this is a long-known problem.

The problem is caused by differences in when each node detects that the cluster has been partitioned.

We think that the problem is in Heartbeat.
* The problem may be in CCM.
* We believe this because the problem does not occur when Pacemaker is combined with corosync,
which reliably notifies Pacemaker of a partition.

Many of our users intend to use quorum control with Heartbeat.

Heartbeat needs to notify Pacemaker of the correct change in node membership, as corosync does.

Is there a plan to fix quorum control in Heartbeat?

Best Regards,
Hideo Yamauchi.
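
To illustrate the timing issue described above, here is a minimal sketch in Python. It is not how Heartbeat or CCM computes membership; it assumes a simple majority rule for quorum, and the node names, detection times, and helper functions are all hypothetical. It only shows how a node that notices a partition late can still believe it has quorum while the other side has already formed a quorate partition.

```python
from dataclasses import dataclass


def has_quorum(visible_nodes: int, total_nodes: int) -> bool:
    """Assumed rule for this sketch: quorum means strictly more than half."""
    return visible_nodes * 2 > total_nodes


@dataclass(frozen=True)
class NodeView:
    name: str
    detect_at: float        # hypothetical time (s) when this node notices the split
    partition: frozenset    # nodes it can still reach after the split


ALL_NODES = ("a", "b", "c")

# Node "a" is cut off from "b" and "c", but notices the split later than they do.
VIEWS = [
    NodeView("a", detect_at=3.0, partition=frozenset({"a"})),
    NodeView("b", detect_at=1.0, partition=frozenset({"b", "c"})),
    NodeView("c", detect_at=1.0, partition=frozenset({"b", "c"})),
]


def membership_at(view: NodeView, t: float) -> frozenset:
    """Membership the node believes in at time t (stale until it detects the split)."""
    return frozenset(ALL_NODES) if t < view.detect_at else view.partition


def quorum_report(t: float) -> dict:
    """Map each node name to whether it believes it has quorum at time t."""
    return {
        v.name: has_quorum(len(membership_at(v, t)), len(ALL_NODES))
        for v in VIEWS
    }


if __name__ == "__main__":
    for t in (0.5, 2.0, 4.0):
        print(t, quorum_report(t))
```

At t = 2.0 in this sketch, node "a" still holds a stale three-node view and so believes it has quorum, while "b" and "c" already hold quorum in their own partition. Once "a" detects the split, its view drops to one node and it loses quorum.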

_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev [at] lists
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/


andrew at beekhof

Aug 26, 2010, 4:33 AM

Post #2 of 10 (725 views)
Re: About movement of the Quorum control. [In reply to]

On Wed, Aug 25, 2010 at 3:20 AM, <renayama19661014 [at] ybb> wrote:
> Hi Developers of Heartbeat,
>
> When Pacemaker is combined with Heartbeat, we find that quorum control does not work well.
>
> For example, the problem occurs in a cluster of several nodes when no-quorum-policy is set to
> anything other than ignore.
>
> We understand that this is a long-known problem.
>
> The problem is caused by differences in when each node detects that the cluster has been partitioned.
>
> We think that the problem is in Heartbeat.
>  * The problem may be in CCM.
>  * We believe this because the problem does not occur when Pacemaker is combined with corosync,
> which reliably notifies Pacemaker of a partition.
>
> Many of our users intend to use quorum control with Heartbeat.
>
> Heartbeat needs to notify Pacemaker of the correct change in node membership, as corosync does.
>
> Is there a plan to fix quorum control in Heartbeat?

I doubt it, I think development on heartbeat is at an end and
maintenance is limited to regressions.
But maybe lge would like to comment further.


renayama19661014 at ybb

Aug 26, 2010, 11:12 PM

Post #5 of 10 (722 views)
Re: About movement of the Quorum control. [In reply to]

Hi Andrew,

Thank you for your comment.

> I doubt it, I think development on heartbeat is at an end and
> maintenance is limited to regressions.

We understand that well.
However, it is very difficult for us to wait for corosync to be stable.

> But maybe lge would like to comment further.

Yes, let's wait for further comments.

Best Regards,



andrew at beekhof

Aug 27, 2010, 12:53 AM

Post #6 of 10 (722 views)
Re: About movement of the Quorum control. [In reply to]

On Fri, Aug 27, 2010 at 8:12 AM, <renayama19661014 [at] ybb> wrote:
> Hi Andrew,
>
> Thank you for your comment.
>
>> I doubt it, I think development on heartbeat is at an end and
>> maintenance is limited to regressions.
>
> We understand that well.

The biggest problem is that none of us actually understands the CCM.
I looked at it once a long time ago and I've never been so frightened
in my life.

But there's nothing stopping someone at NTT becoming a CCM expert ;-)

> However, it is very difficult for us to wait for corosync to be stable.

It's pretty close these days.
It does lack ucast and bcast support, but there are plans to address
that outside of corosync.



lmb at novell

Aug 27, 2010, 7:04 AM

Post #7 of 10 (711 views)
Re: About movement of the Quorum control. [In reply to]

On 2010-08-27T09:53:32, Andrew Beekhof <andrew [at] beekhof> wrote:

> > However, it is very difficult for us to wait for corosync to be stable.
> It's pretty close these days.
> It does lack ucast and bcast support, but there are plans to address
> that outside of corosync.

The current corosync is really quite stable, certainly as stable as CCM.
If it isn't, there's at least someone we can ask for active help.

bcast is handled by corosync, by the way; and unicast support may also
be coming.

I'm curious what you mean by "outside of corosync", though - how would
the network protocol be addressed outside of corosync?


Regards,
Lars

--
Architect Storage/HA, OPS Engineering, Novell, Inc.
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde



andrew at beekhof

Aug 27, 2010, 7:15 AM

Post #8 of 10 (715 views)
Re: About movement of the Quorum control. [In reply to]

On Fri, Aug 27, 2010 at 4:04 PM, Lars Marowsky-Bree <lmb [at] novell> wrote:
> On 2010-08-27T09:53:32, Andrew Beekhof <andrew [at] beekhof> wrote:
>
>> > However, it is very difficult for us to wait for corosync to be stable.
>> It's pretty close these days.
>> It does lack ucast and bcast support, but there are plans to address
>> that outside of corosync.
>
> The current corosync is really quite stable, certainly as stable as CCM.
> If it isn't, there's at least someone we can ask for active help.
>
> bcast is handled by corosync, by the way; and unicast support may also
> be coming.
>
> I'm curious what you mean by "outside of corosync", though - how would
> the network protocol be addressed outside of corosync?

http://www.linuxplumbersconf.org/2010/ocw/proposals/1065

Basically it's intended to be a generically useful layer sitting
underneath corosync.
Steve was skeptical at first but is apparently on board with the idea
now - to the point where it's the preferred approach for providing
ucast capabilities.


andrew at beekhof

Aug 27, 2010, 7:16 AM

Post #9 of 10 (712 views)
Re: About movement of the Quorum control. [In reply to]

On Fri, Aug 27, 2010 at 4:04 PM, Lars Marowsky-Bree <lmb [at] novell> wrote:
> bcast is handled by corosync, by the way; and unicast support may also
> be coming.

Yeah, but it's not supported on RHEL, which means he's not actively
testing it or fixing bugs.
Your mileage may vary :-)


lmb at novell

Aug 30, 2010, 2:24 AM

Post #10 of 10 (664 views)
Re: About movement of the Quorum control. [In reply to]

On 2010-08-27T16:16:45, Andrew Beekhof <andrew [at] beekhof> wrote:

> > bcast is handled by corosync, by the way; and unicast support may also
> > be coming.
> Yeah, but it's not supported on RHEL, which means he's not actively
> testing it or fixing bugs.
> Your mileage may vary :-)

We do test it though - at a network level, it is not really
distinguishable from using the bcast address as the multicast address.
It is subnet-local multicast, and there's really no reason to believe
it shouldn't work well.

(Many lower-end switches actually treat mcast like bcast anyway, and
just broadcast it to every port ...)
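
As a small illustration of the addressing point above, the following Python sketch uses only the standard `ipaddress` module to show how a subnet's broadcast address is derived and how it differs in address class from a multicast group. The subnet 192.168.1.0/24, the group 226.94.1.1, and the `describe` helper are hypothetical values chosen for the example, not taken from any configuration in this thread.

```python
import ipaddress


def describe(addr_text: str, network_text: str) -> dict:
    """Classify an IPv4 address relative to a given subnet."""
    addr = ipaddress.ip_address(addr_text)
    net = ipaddress.ip_network(network_text)
    return {
        "in_subnet": addr in net,
        "is_subnet_broadcast": addr == net.broadcast_address,
        "is_multicast": addr.is_multicast,
    }


if __name__ == "__main__":
    subnet = "192.168.1.0/24"  # hypothetical cluster subnet
    # The subnet broadcast address reaches every host on the local segment.
    print(ipaddress.ip_network(subnet).broadcast_address)
    print(describe("192.168.1.255", subnet))
    # A multicast group lies in 224.0.0.0/4, outside the subnet's own range.
    print(describe("226.94.1.1", subnet))
```

Either way the traffic stays on the local segment unless it is routed, which is consistent with the remark that a switch flooding multicast to every port makes the two behave alike in practice.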


Regards,
Lars

--
Architect Storage/HA, OPS Engineering, Novell, Inc.
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde
