
Mailing List Archive: Linux-HA: Pacemaker

[ANNOUNCE] The Booth Cluster Ticket Manager - part of multi-site support in pacemaker

 

 



jjzhang at suse

Dec 4, 2011, 10:18 PM

Post #1 of 12 (1179 views)
[ANNOUNCE] The Booth Cluster Ticket Manager - part of multi-site support in pacemaker

Hello everyone,

I'm happy to announce the Booth cluster ticket manager, which is part
of the key feature for pacemaker in 2011 - improving support for
multi-site clusters.

Multi-site clusters can be considered as "overlay" clusters where each
cluster site corresponds to a cluster node in a traditional cluster. The
overlay cluster is managed by the booth mechanism. It guarantees that
the cluster resources will be highly available across different cluster
sites. This is achieved by using so-called tickets that are treated as
failover domains between cluster sites, in case a site should be down.
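
To make the ticket idea concrete, here is a minimal Python sketch. It is
not booth code: the Site class, may_run and fail_over are hypothetical
names used only for illustration. It shows a ticket acting as the unit of
failover: a resource bound to a ticket may run only at the site that
currently holds that ticket.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """One cluster site in the overlay cluster (hypothetical model)."""
    name: str
    tickets: set = field(default_factory=set)  # tickets currently granted to this site

    def may_run(self, required_ticket: str) -> bool:
        # A resource bound to a ticket may run only where that ticket is held.
        return required_ticket in self.tickets


def fail_over(ticket: str, old: Site, new: Site) -> None:
    """Move a ticket between sites: take it away first, then hand it over."""
    old.tickets.discard(ticket)
    new.tickets.add(ticket)


site_a = Site("siteA", {"ticketA"})
site_b = Site("siteB")
fail_over("ticketA", site_a, site_b)
print(site_a.may_run("ticketA"), site_b.may_run("ticketA"))  # False True
```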

Booth is designed as an add-on to pacemaker, and it is now also
hosted at ClusterLabs, together with pacemaker. You can find it at:

https://github.com/ClusterLabs/booth

Booth is still under heavy development, so it may not work for you for
the time being;) But I'll keep working on it ...

Review and comments are highly appreciated!

Thanks,
Jiaju


_______________________________________________
Pacemaker mailing list: Pacemaker [at] oss
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


leetaehoon at intellilink

Jan 5, 2012, 12:35 AM

Post #2 of 12 (1104 views)
Re: [ANNOUNCE] The Booth Cluster Ticket Manager - part of multi-site support in pacemaker [In reply to]

Hi Jiaju

I tried booth and got this error:

crm_ticket: invalid option -- 'n'

Is -n a valid option in the pacemaker tools?
I think -v is the right option.
I am sending the attached patch file.
Please check it.

Best Regards, Taihun


Attachments: pacemaker.c.patch (0.79 KB)


jjzhang at suse

Jan 6, 2012, 7:27 AM

Post #3 of 12 (1058 views)
Re: [ANNOUNCE] The Booth Cluster Ticket Manager - part of multi-site support in pacemaker [In reply to]

Hi Taihun,

On Thu, 2012-01-05 at 17:35 +0900, 李泰勲 wrote:
> Hi Jiaju
>
> I tried booth and got this error:
>
> crm_ticket: invalid option -- 'n'
>
> Is -n a valid option in the pacemaker tools?
> I think -v is the right option.
> I am sending the attached patch file.
> Please check it.

Yes, "-v" is the right option. Thanks for the patch!
However, I merged Daniel's patches yesterday, which contain
the same fix, so I can't merge your patch this time;)

Many thanks for looking at booth!

Thanks,
Jiaju




leetaehoon at intellilink

Jan 9, 2012, 11:58 PM

Post #4 of 12 (1052 views)
Re: [ANNOUNCE] The Booth Cluster Ticket Manager - part of multi-site support in pacemaker [In reply to]

Hi Jiaju

Thanks for your reply.
I have checked the source code.

I also have some questions about booth.

Here are my simple test notes:

1. I built and installed booth with Pacemaker 1.1 on RHEL 6.1.
2. I have a two-site cluster (siteA, siteB) and one arbitrator.
3. Each site consists of one node and has a dummy resource.
4. booth is started on each site and on the arbitrator.
5. A ticket is granted to siteA, which allows the dummy resource to run there.
6. I halt siteA.

I expected the ticket to be revoked from siteA and an automatic failover
to siteB, but it didn't work.
I am wondering about the connection between each site and the arbitrator.

Questions:
1. Does booth work only on SUSE Linux Enterprise 11 SP2?
2. I would like to verify booth on RHEL 6.1. Is that possible?
3. Is there any information about booth other than the SUSE documentation?
4. Does the automatic transfer of the ticket on failure currently work
correctly on SUSE?

Could you give me any advice?

Best Regards, Taihun




leetaehoon at intellilink

Feb 6, 2012, 6:09 PM

Post #5 of 12 (989 views)
Re: [ANNOUNCE] The Booth Cluster Ticket Manager - part of multi-site support in pacemaker [In reply to]

Hi Jiaju

I am testing how booth works while reading the booth source code.

I don't fully understand the ticket grant and revoke process, in
particular how the booth instances communicate with each other,
so I would like to know how booth is intended to work, to match it
against the source code you published.

Could you give me information about the booth sequences, that is, the
ticket grant, revoke and lease logic, and how the ticket expiry time works?

When do you think booth will be fixed and complete?

Is there anything I can help with, in booth's implementation or elsewhere?

Best Regards, Taihun



jjzhang at suse

Feb 7, 2012, 10:11 PM

Post #6 of 12 (979 views)
Re: [ANNOUNCE] The Booth Cluster Ticket Manager - part of multi-site support in pacemaker [In reply to]

On Tue, 2012-02-07 at 11:09 +0900, 李泰勲 wrote:
> Hi Jiaju
>
> I am testing how booth works while reading the booth source code.
>
> I don't fully understand the ticket grant and revoke process, in
> particular how the booth instances communicate with each other,
> so I would like to know how booth is intended to work, to match it
> against the source code you published.
>
> Could you give me information about the booth sequences, that is, the
> ticket grant, revoke and lease logic, and how the ticket expiry time works?

Generally speaking, you would want to grant a ticket to a certain site
initially, which means the corresponding resources can run at that
site. For the lease logic, granting a ticket means that the site holds
the ticket lease. The lease has an expiry time; once it has passed, the
lease is expired and the corresponding resources can no longer run at
that site.
If the site that was originally granted the ticket is alive, it will
renew the lease before the ticket expires. But if that site is broken,
the lease logic enters an election stage when the lease expires, and a
new site gets the ticket lease, so the resources will be able to run at
the new site.
You can revoke the ticket from a site as well, but in most cases you
may not want to do this. The possible scenarios I can think of are when
the admin wants to do some maintenance work, or wants to manage the
ticket manually.
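
The grant / renew / expire / elect cycle described above can be sketched
in a few lines of Python. This is only an illustration under stated
assumptions, not booth's implementation: Lease, grant, renew and elect
are hypothetical names, time is a plain number of seconds passed in by
the caller, and the "election" simply picks the first live site as a
placeholder for the real algorithm.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Lease:
    owner: Optional[str] = None  # site currently holding the ticket lease
    expires_at: float = 0.0      # time (seconds) at which the lease runs out

    def valid(self, now: float) -> bool:
        return self.owner is not None and now < self.expires_at


def grant(lease: Lease, site: str, now: float, term: float) -> None:
    """Give the ticket lease to a site for `term` seconds."""
    lease.owner = site
    lease.expires_at = now + term


def renew(lease: Lease, site: str, now: float, term: float) -> bool:
    """A live owner extends its lease before it expires."""
    if lease.owner == site and lease.valid(now):
        lease.expires_at = now + term
        return True
    return False


def elect(lease: Lease, live_sites: List[str], now: float, term: float) -> Optional[str]:
    """After expiry, hand the lease to a new site (placeholder election)."""
    if lease.valid(now):
        return lease.owner  # lease still held, nothing to elect
    if not live_sites:
        lease.owner = None
        return None
    grant(lease, live_sites[0], now, term)
    return lease.owner
```

For example, if siteA is granted the ticket at t=0 with a 60 second
term and renews at t=50, the lease runs until t=110. If siteA then
breaks, elect() keeps returning siteA until t=110 and only afterwards
hands the lease to another live site.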

>
> When do you think booth will be fixed and complete?

Oh, I have not finished it yet;) I'm still working on it, but since I
also have some other tasks, progress has not been fast these days;)

>
> Is there anything I can help with, in booth's implementation or elsewhere?

The framework is finished, but there are still some bugs in it, so the
code may not work for you for the time being. I'll be more than happy if
anyone can help to fix bugs or develop new features;)
For the short term, I think adding man pages, documentation and some
automated test programs/scripts would be very good. For the long term,
I also have something new in mind; maybe I should add a TODO to
document it later.

Well, the primary thing for now is to fix the current bugs to make it
really work, and I will spend more time on it myself over the next two
weeks;)

Thanks,
Jiaju



leetaehoon at intellilink

Feb 8, 2012, 12:05 AM

Post #7 of 12 (975 views)
Re: [ANNOUNCE] The Booth Cluster Ticket Manager - part of multi-site support in pacemaker [In reply to]

Hi Jiaju

Thank you for your reply.


(2012/02/08 15:11), Jiaju Zhang wrote:
> Generally speaking, you would want to grant a ticket to a certain site
> initially, which means the corresponding resources can run at that
> site. For the lease logic, granting a ticket means that the site holds
> the ticket lease. The lease has an expiry time; once it has passed, the
> lease is expired and the corresponding resources can no longer run at
> that site.
> If the site that was originally granted the ticket is alive, it will
> renew the lease before the ticket expires. But if that site is broken,
> the lease logic enters an election stage when the lease expires, and a
> new site gets the ticket lease, so the resources will be able to run at
> the new site.
> You can revoke the ticket from a site as well, but in most cases you
> may not want to do this. The possible scenarios I can think of are when
> the admin wants to do some maintenance work, or wants to manage the
> ticket manually.
We understand how booth works from your reply,
but I am wondering how booth behaves when a split-brain occurs
between sites.

For example:

siteA has been granted ticketA.
siteB has no grant.
siteB is the arbitrator.

If siteA's connection fails and it is isolated from the other sites,
failover to siteB happens automatically,
but ticketA was not revoked on siteA.

I think the ticket should be granted on siteB and revoked on siteA.
Is my understanding correct?
>> When do you think booth will be fixed and complete?
> Oh, I have not finished it yet;) I'm still working on it, but since I
> also have some other tasks, progress has not been fast these days;)
>
>> Is there anything I can help with, in booth's implementation or elsewhere?
> The framework is finished, but there are still some bugs in it, so the
> code may not work for you for the time being. I'll be more than happy if
> anyone can help to fix bugs or develop new features;)
> For the short term, I think adding man pages, documentation and some
> automated test programs/scripts would be very good. For the long term,
> I also have something new in mind; maybe I should add a TODO to
> document it later.
>
> Well, the primary thing for now is to fix the current bugs to make it
> really work, and I will spend more time on it myself over the next two
> weeks;)
If there are any bugs you already know about, could you please share
that information?
And if we find bugs in booth, we will send a report and, if possible, a patch.


Best Regards, Taihun



jjzhang at suse

Feb 8, 2012, 1:33 AM

Post #8 of 12 (987 views)
Re: [ANNOUNCE] The Booth Cluster Ticket Manager - part of multi-site support in pacemaker [In reply to]

On Wed, 2012-02-08 at 17:05 +0900, 李泰勲 wrote:
> Hi Jiaju
>
> Thank you for your reply.
>
>
> We understand how booth works from your reply,
> but I am wondering how booth behaves when a split-brain occurs
> between sites.
>
> For example:
>
> siteA has been granted ticketA.
> siteB has no grant.
> siteB is the arbitrator.

Well, site B cannot be the arbitrator. The arbitrator doesn't need to be
a cluster site; it can be just a single machine, but it should be at a
third site;)
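
One way to see why the arbitrator belongs at a third location is a small
Python sketch. It assumes, and this is an assumption not stated in this
thread, that a side may act on a ticket only while it can reach a strict
majority of all members. MEMBERS and has_majority are hypothetical names
for illustration.

```python
MEMBERS = frozenset({"siteA", "siteB", "arbitrator"})


def has_majority(reachable, members=MEMBERS) -> bool:
    """True if the members a node can reach (itself included) are a strict majority."""
    return 2 * len(set(reachable) & members) > len(members)


# siteA is cut off and sees only itself: no majority.
print(has_majority({"siteA"}))                # False
# siteB can still reach the arbitrator: majority.
print(has_majority({"siteB", "arbitrator"}))  # True
# With only two sites and no arbitrator, an isolated site can never decide.
print(has_majority({"siteB"}, frozenset({"siteA", "siteB"})))  # False
```

If the arbitrator lived at siteB, losing siteB would take two of the
three votes with it, which is the situation the third location avoids.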

>
> If siteA's connection fails and it is isolated from the other sites,
> failover to siteB happens automatically,
> but ticketA was not revoked on siteA.

ticketA will expire at site A and then it will be revoked. The
algorithm guarantees that before the ticket is granted to siteB, it
has been revoked on siteA.
However, the current code has a bug in ticket revoking;) I have an
unfinished patch in hand and plan to submit it this week.
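
The ordering guarantee can be illustrated with a small timing sketch in
Python. The assumptions are mine, not booth's: both sides measure time
from the last successful renewal, the isolated holder gives up the
ticket as soon as its lease term has passed, and the others wait for the
term plus a hypothetical safety margin before granting the ticket again.
The function views and its parameters are made-up names.

```python
def views(now, last_renewal, term, margin):
    """Return (siteA still holds the ticket, siteB may be granted it) at time `now`."""
    a_still_holds = now < last_renewal + term
    b_may_be_granted = now >= last_renewal + term + margin
    return a_still_holds, b_may_be_granted


# Scan a timeline for any instant at which both sites could hold the ticket.
overlap = [t for t in range(0, 200) if all(views(t, last_renewal=0, term=60, margin=5))]
print(overlap)  # []
```

Under these assumptions there is no moment at which both conditions are
true, so the ticket is never active at two sites at once.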

>
> I think the ticket should be granted on siteB and revoked on siteA.
> Is my understanding correct?
> If there are any bugs you already know about, could you please share
> that information?

OK. One known issue is with ticket revoking, which I'm working on.
For the others, I think I should look through the Novell Bugzilla first,
and then send you the links, or the descriptions if you don't have a
Novell Bugzilla account. Of course, you're welcome to report issues and
send patches here;)

> And if we find bugs in booth, we will send a report and, if possible, a patch.

Many many thanks;)

Thanks,
Jiaju





leetaehoon at intellilink

Feb 8, 2012, 2:27 AM

Post #9 of 12 (975 views)
Re: [ANNOUNCE] The Booth Cluster Ticket Manager - part of multi-site support in pacemaker [In reply to]

Hi Jiaju

(2012/02/08 18:33), Jiaju Zhang wrote:
> On Wed, 2012-02-08 at 17:05 +0900, 李泰勲 wrote:
>> siteA has been granted ticketA.
>> siteB has no grant.
>> siteB is the arbitrator.
I made a mistake: the arbitrator is not siteB, it's siteC. Sorry.

> Well, site B cannot be the arbitrator. The arbitrator doesn't need to be
> a cluster site; it can be just a single machine, but it should be at a
> third site;)
>
>> If siteA's connection fails and it is isolated from the other sites,
>> failover to siteB happens automatically,
>> but ticketA was not revoked on siteA.
> ticketA will expire at site A and then it will be revoked. The
> algorithm guarantees that before the ticket is granted to siteB, it
> has been revoked on siteA.
> However, the current code has a bug in ticket revoking;) I have an
> unfinished patch in hand and plan to submit it this week.
All right. I will be waiting for it.
>
>> I think the ticket should be granted on siteB and revoked on siteA.
>> Is my understanding correct?
>> If there are any bugs you already know about, could you please share
>> that information?
> OK. One known issue is with ticket revoking, which I'm working on.
> For the others, I think I should look through the Novell Bugzilla first,
> and then send you the links, or the descriptions if you don't have a
> Novell Bugzilla account. Of course, you're welcome to report issues and
> send patches here;)

OK! I will be waiting for it.

Thanks.
Taihun



leetaehoon at intellilink

Feb 17, 2012, 12:54 AM

Post #10 of 12 (934 views)
Re: [ANNOUNCE] The Booth Cluster Ticket Manager - part of multi-site support in pacemaker [In reply to]

Hi, Jiaju

I am testing the behavior of the revoke implementation manually.

I have some questions about revoke.

If I revoke the ticket at the site it was granted to, that site loses
the grant.
But after that, the other site automatically takes over the ticket!
Is that the intended behavior?

I think the expiry timer should be released at all sites by that operation.
What do you think?

https://bugzilla.novell.com/index.cgi
I can't find any booth entries at that link.
Is that the right link for the Novell Bugzilla?

Best Regards, Taihun

(2012/02/08 18:33), Jiaju Zhang wrote:
> On Wed, 2012-02-08 at 17:05 +0900, $BM{BY7.(B wrote:
>> Hi Jiaju
>>
>> Thank you for your reply.
>>
>>
>> (2012/02/08 15:11), Jiaju Zhang wrote:
>>> On Tue, 2012-02-07 at 11:09 +0900, $BM{BY7.(B wrote:
>>>> Hi Jiaju
>>>>
>>>> I am testing about working of booth while investigating booth source code.
>>>>
>>>> I don't understand ticket grant and revoke process perfectly that is
>>>> related to connecting each booth
>>>> so I would like to know booth's working that would be matching your
>>>> offering source code.
>>>>
>>>> Could you give me information about booth sequences that would be the
>>>> ticket's grant,revoke,lease logic and working of ticket's expiry time.
>>> General speaking, you would want to grant some ticket on certain site
>>> initially, which means the corresponding resources can be run at that
>>> site. For the lease logic, ticket granting means that site has the
>>> ticket lease. The lease has an expiry time, after the expiry time, that
>>> lease is expired and the corresponding resources can't be run at that
>>> site any longer.
>>> If the site which has the original ticket granting is alive, it will
>>> renew the lease before the ticket expired, but if that site is broken,
>>> when the lease is expired, the lease logic will go into election stage
>>> and a new site will get the ticket lease, thus the resources will be
>>> able to run at the new site.
>>> You can revoke the ticket from the site as well, but in most cases, you
>>> may not want to do this. The possible scenario I can think of is when
>>> the admin wants to do some maintenance work, or wants to do the ticket
>>> management manually.
>> We have understood booth working as your reply.
>> but I am wondering booth working process when it occurred splitbrain in
>> each sites.
>>
>> for example.
>>
>> siteA has ticketA grant.
>> siteB has no grant.
>> siteB is Arbitrator.
> Well, site B could not be the arbitrator. Arbitrator doesn't need to be
> a cluster site, it can be just a machine, but it should be in the 3rd
> site;)
>
>> if siteA connection fails and isolated from each sites,
>> automatically failover to siteB.
>> but ticketA did not revoke in siteA.
> The ticketA will expire in site A and then it will be revoked. The
> algorithm is able to guarantee before the ticket is granted to siteB, it
> has been revoked on siteA.
> However, current code has some bug on the ticket revoking;) I have an
> unfinished patch in hand and plan to submit it this week.
>
>> I think that ticket granting is alive in siteB and revoking in siteA.
>> Is that right that I think?
>>>> when do you think the booth's working is fixed and completed ?
>>> Oh, I have not finished it yet;) But I'm still working on it, since I
>>> also have some other tasks, maybe the progress is not fast these days;)
>>>
>>>> Is there anything to help you about booth's implementation or etc?
>>> The framework is finished, but there are still some bugs in it, so the
>>> code may not work for you for the time being, I'll be more than happy if
>>> anyone can help to fix bugs, or develop new features;)
>>> For the short term, I think adding the man pages, documentation and some
>>> automation test programs/scripts would be very good. For the long term,
>>> I also have something new in my mind, maybe I should add a TODO to
>>> document it later.
>>>
>>> Well, the primary thing for now is to fix current bugs to make it really
>>> working, and I myself will spend more time on it these two weeks;)
>> If you have any bugs that you recognized already,Please you give me that
>> information?
> OK. One known issue is the ticket revoking, which I'm working on.
> For the others, I think I should look through the Novell Bugzilla
> first, and then send you the links, or the descriptions if you don't
> have a Novell Bugzilla account. Of course, you're welcome to report
> issues and send patches here;)
>
>> And if we find booth bugs, we will send reports and patches where possible.
> Many many thanks;)
>
> Thanks,
> Jiaju
>
>
>
>
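
The expiry guarantee described in the quoted reply above (the ticket has expired on siteA before it can be granted to siteB) can be illustrated with a small Python sketch. This is only a toy model under stated assumptions, not booth's actual algorithm: the names `Lease`, `owner_holds_ticket` and `may_grant_elsewhere` are hypothetical, and the sketch assumes all sites share one clock, which a real implementation cannot rely on.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    owner: str
    expires_at: float  # time after which the owner must treat the ticket as revoked


def owner_holds_ticket(lease: Lease, now: float) -> bool:
    """The owning site may run the protected resources only before expiry."""
    return now < lease.expires_at


def may_grant_elsewhere(lease: Lease, now: float) -> bool:
    """Another site may be granted the ticket only once the lease has expired."""
    return now >= lease.expires_at


if __name__ == "__main__":
    lease = Lease(owner="siteA", expires_at=60.0)
    for now in (0.0, 59.9, 60.0, 120.0):
        # At no instant do both conditions hold, so the ticket is never
        # active on siteA and grantable to siteB at the same time.
        assert not (owner_holds_ticket(lease, now) and may_grant_elsewhere(lease, now))
        print(now, owner_holds_ticket(lease, now), may_grant_elsewhere(lease, now))
```

The point of the model is that an isolated siteA does not need to hear from anyone to give up the ticket: its own timer is enough, which is why the failover to siteB can be safe even during a split-brain.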




jjzhang at suse

Feb 20, 2012, 6:08 AM

Post #11 of 12 (940 views)
Permalink
Re: [ANNOUNCE] The Booth Cluster Ticket Manager - part of multi-site support in pacemaker [In reply to]

On Fri, 2012-02-17 at 17:54 +0900, 李泰勲 wrote:
> Hi, Jiaju
>
> I am testing the behavior of the revoke implementation manually.
>
> I have some questions about revoke.
>
> If I revoke the ticket at the site that holds the grant, that site is
> released from the granted state,
> but the other site then performs an automatic failover!
> Is that the intended behavior?

Yes, that is the current logic. My original idea was for the admin to
use the ticket revoke command when doing maintenance work. One possible
scenario is that the admin runs the revoke command to revoke the ticket
from one site and then does the maintenance work at that site.
If auto-failover has been configured, the ticket will fail over to
another site to keep the service available. If auto-failover has not
been set, the ticket won't fail over.
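
The revoke behavior described above can be sketched in a few lines of Python. This is only an illustration of the logic as explained in this message, not booth's actual code; the names `Ticket` and `revoke` and the `auto_failover` flag are hypothetical and do not correspond to any real booth or pacemaker API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Ticket:
    name: str
    owner: Optional[str] = None  # site currently holding the grant, if any


def revoke(ticket: Ticket, sites: List[str], auto_failover: bool) -> Optional[str]:
    """Revoke the ticket from its owner and return the new owner (or None).

    Hypothetical model: with auto_failover the ticket is granted to the
    first other site in `sites`; otherwise it stays ungranted.
    """
    previous = ticket.owner
    ticket.owner = None  # the revoking site gives up the grant
    if auto_failover:
        for site in sites:
            if site != previous:
                ticket.owner = site  # fail over to another site
                break
    return ticket.owner


if __name__ == "__main__":
    sites = ["siteA", "siteB"]
    print(revoke(Ticket("ticketA", owner="siteA"), sites, auto_failover=True))
    print(revoke(Ticket("ticketA", owner="siteA"), sites, auto_failover=False))
```

With auto-failover configured the ticket moves to another site; without it the ticket simply stays ungranted, which is the state an admin would want during maintenance.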

>
> I think the expiry timer should be released on all sites in that operation.
> What do you think about that?

I think your suggestion is reasonable, and the current revoke logic
might confuse people;) So I'm thinking of changing it now.

>
> https://bugzilla.novell.com/index.cgi
> I can't find any booth articles at that link.
> Is that the right link for the Novell Bugzilla?

Yes, that is the link for the bug tracker. But the booth articles are
not there. A draft document for booth is at:

http://doc.opensuse.org/products/draft/SLE-HA/SLE-ha-guide_sd_draft/cha.ha.geo.html

And when the GEO cluster is released, there will be some revised
documents as well;)

Thanks,
Jiaju






leetaehoon at intellilink

Feb 20, 2012, 6:49 PM

Post #12 of 12 (918 views)
Permalink
Re: [ANNOUNCE] The Booth Cluster Ticket Manager - part of multi-site support in pacemaker [In reply to]

Hi, Jiaju

Thank you for your reply.

(2012/02/20 23:08), Jiaju Zhang wrote:
> On Fri, 2012-02-17 at 17:54 +0900, 李泰勲 wrote:
>> Hi, Jiaju
>>
>> I am testing the behavior of the revoke implementation manually.
>>
>> I have some questions about revoke.
>>
>> If I revoke the ticket at the site that holds the grant, that site is
>> released from the granted state,
>> but the other site then performs an automatic failover!
>> Is that the intended behavior?
> Yes, that is the current logic. My original idea was for the admin to
> use the ticket revoke command when doing maintenance work. One possible
> scenario is that the admin runs the revoke command to revoke the ticket
> from one site and then does the maintenance work at that site.
> If auto-failover has been configured, the ticket will fail over to
> another site to keep the service available. If auto-failover has not
> been set, the ticket won't fail over.
>
>> I think the expiry timer should be released on all sites in that operation.
>> What do you think about that?
> I think your suggestion is reasonable, and the current revoke logic
> might confuse people;) So I'm thinking of changing it now.
I agree with you.
I look forward to its release.
Once it is done, I will test it.

>> https://bugzilla.novell.com/index.cgi
>> I can't find any booth articles at that link.
>> Is that the right link for the Novell Bugzilla?

> Yes, that is the link for the bug tracker. But the booth articles are
> not there. A draft document for booth is at:
>
> http://doc.opensuse.org/products/draft/SLE-HA/SLE-ha-guide_sd_draft/cha.ha.geo.html
I searched for booth bugs in the Novell Bugzilla, but I couldn't find any.
Have you registered any of the bugs you are aware of?
And can I register booth bugs in the Bugzilla if I find any?

Best Regards, Taihun





