
Mailing List Archive: Linux-HA: Pacemaker

corosync vs. pacemaker 1.1

 

 



bence at noc

Jan 25, 2012, 7:08 AM

Post #1 of 8 (1686 views)
corosync vs. pacemaker 1.1

Hi,

I am a newbie to clustering and I am trying to build a two-node
active/passive cluster based on the documentation:
http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf

My systems are Fedora 14, up to date. After forming the cluster as
described, I started to test it (resources: drbd -> lvm -> fs -> group
of services). Resources were moved around, and nodes were rebooted and
killed (first in a virtual environment, then on real machines).
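
Roughly, the stack is wired together like this in the crm shell (the
resource names and parameters below are placeholders rather than my
exact configuration):

primitive p-drbd ocf:linbit:drbd \
        params drbd_resource="r0" \
        op monitor interval="30s" role="Master" \
        op monitor interval="31s" role="Slave"
ms ms-drbd p-drbd \
        meta master-max="1" master-node-max="1" \
             clone-max="2" clone-node-max="1" notify="true"
primitive p-lvm ocf:heartbeat:LVM params volgrpname="cluster"
primitive p-fs ocf:heartbeat:Filesystem \
        params device="/dev/cluster/data" directory="/srv/data" fstype="ext4"
group g-services p-lvm p-fs
colocation col-services-with-drbd inf: g-services ms-drbd:Master
order ord-drbd-before-services inf: ms-drbd:promote g-services:start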

After some of these events the two nodes ended up in a kind of
split-brain state. crm_mon on each node showed the other node as
offline, although the drbd subsystem showed everything in sync and
working. The network was not the issue (ping, TCP and UDP communication
were fine); nothing had changed from the network's point of view.

At first rejoining went quite well, but after a few more events it took
longer, and after still more events it did not happen at all. A network
dump showed the multicast packets still coming and going, yet at the
corosync level (crm_node -l) neither node listed the other. After I
tried to reconfigure the cib, the logs were full of messages like
"<the other node>: not in our membership".

I tried to erase the config (crm configure erase, cibadmin -E -f), but
it only worked locally. I noticed that the pacemaker process did not
start up normally on the node that booted after the other one. I also
tried removing the files from /var/lib/pengine/ and
/var/lib/heartbeat/crm/, but that only removed the resources; it did not
help the nodes form a cluster even without resources. The pacemaker
process exited some 20 minutes after it started, and starting it
manually gave the same result.

After digging through Google for answers I found nothing helpful.
Following some tips I came across, I changed the version in the
/etc/corosync/service.d/pcmk file to 1.1 (the version of pacemaker in
this distro). I realized that the cluster processes were then being
started by corosync itself, not by pacemaker, which could therefore be
left out. Cluster formation has been stable since this change, even
after many, many events.
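
For reference, /etc/corosync/service.d/pcmk now contains roughly the
following (the ver line is the only thing I changed; as far as I recall
the document's example uses ver: 1):

service {
        # Load the Pacemaker Cluster Resource Manager
        name: pacemaker
        # changed from the documented value to the packaged pacemaker version
        ver: 1.1
}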

Now I have reread the document mentioned above, and I wonder why it
includes the "Important notice" on page 37. What is theoretically wrong
with my scenario? Why does it work? Why didn't the configuration
suggested by the document work?

Tests were done first on Fedora 14 virtual machines (per node: 1 CPU
core, 512 MB RAM, 10 GB disk, a 1 GB drbd on a logical volume, and a
physical volume on drbd forming a volume group named "cluster").

Then on real machines: more CPU cores (4), more RAM (4 GB), more disk
(mirrored 750 GB), a 180 GB drbd, and a 100 Mbit guaranteed routed link
between the nodes, which are 5 hops apart.

By the way, how should one configure corosync to work on a
multicast-routed network? I had to create an openvpn tap link between
the real nodes to make it work; the original config with public IPs did
not. Is corosync equipped to cope with multicast PIM messages, or was it
a firewall issue?
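
What I have in mind is something along these lines (the addresses are
placeholders, and the ttl option is only my guess at what a routed
multicast setup needs):

totem {
        version: 2
        interface {
                ringnumber: 0
                bindnetaddr: 192.0.2.0     # placeholder network
                mcastaddr: 239.255.1.1
                mcastport: 5405
                ttl: 8                     # let the packets cross the 5 hops
        }
}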

Thanks in advance,
Bence

_______________________________________________
Pacemaker mailing list: Pacemaker [at] oss
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


df.cluster at gmail

Jan 26, 2012, 5:13 AM

Post #2 of 8 (1617 views)
Re: corosync vs. pacemaker 1.1 [In reply to]

Hi,

On Wed, Jan 25, 2012 at 5:08 PM, Kiss Bence <bence [at] noc> wrote:
> Hi,
>
> I am newbie to the clustering and I am trying to build a two node
> active/passive cluster based upon the documentation:
> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>
> My systems are Fedora 14, uptodate. After forming the cluster as wrote, I
> started to test it. (resources: drbd-> lvm-> fs ->group of services)
> Resources moved around, nodes rebooted and killed (first I tried it in
> virtual environment then also on real machines).
>
> After some events the two nodes ended up in a kind of state of split-brain.
> The crm_mon showed me that the other node is offline at both nodes although
> the drbd subsystem showed everything in sync and working. The network was
> not the issue (ping, tcp and udp communications were fine). Nothing changed
> from the network view.
>
> At first the rejoining took place quiet well, but some more events after it
> took longer and after more event it didn't. The network dump showed me the
> multicast packets still coming and going. At corosync (crm_node -l) the
> other node didn't appeared both on them. After trying configuring the cib
> logs was full of messages like "<the other node>: not in our membership".
>
> I tried to erase the config (crm configure erase, cibadmin -E -f) but it
> worked only locally. I noticed that the pacemaker process didn't started up
> normally on the node that was booting after the other. I also tried to
> remove files from /var/lib/pengine/ and /var/lib/hearbeat/crm/ but only the
> resources are gone. It didn't help on forming a cluster without resources.
> The pacemaker process exited some 20 minutes after it started. Manual
> starting was the same.
>
> After digging into google for answers I found nothing helpful. From running
> tips I changed in the /etc/corosync/service.d/pcmk file the version to 1.1
> (this is the version of the pacemaker in this distro). I realized that the
> cluster processes were startup from corosync itself not by pacemaker. Which
> could be omitted. The cluster forming is stable after this change even after
> many many events.
>
> Now I reread the document mentioned above, and I wonder why it wrote the
> "Important notice" on page 37. What is wrong theoretically with my scenario?
> Why does it working? Why didn't work the config suggested by the document?
>
> Tests were done firsth on virtual machines of a Fedora 14 (1 CPU core, 512Mb
> ram, 10G disk, 1G drbd on logical volume, physical  volume on drbd forming
> volgroup named cluster.)/node.
>
> Then on real machines. They have more cpu cores (4), more RAM (4G) and more
> disk (mirrored 750G), 180G drbd, and 100M garanteed routed link between the
> nodes 5 hops away.
>
> By the way how should one configure the corosync to work on multicast routed
> network? I had to create an openvpn tap link between the real nodes for
> working. The original config with public IP-s didn't worked. Is corosync
> equipped to cope with the multicast pim messages? Or it was a firewall
> issue.

First question, what versions of software are on each of the nodes?

When using multicast, corosync doesn't care about "routing" the
messages AFAIK; it relies on the network layer to do its job. The
"split-brain" you mention can also take place due to a network
interruption, or due to missing or untested fencing.

Second question, do you have fencing configured?

You've mentioned 2(?) nodes "5 hops away", I'm guessing they're not in
the same datacenter. If so, did you also test the latency on the
network between endpoints? Also can you make sure PIM routing is
enabled on all of the "hops" along the way?
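
If you want to verify multicast delivery end to end, a tool like omping
can help; for example (the multicast address and port here are just
examples, match them to your corosync.conf), run the same command on
both nodes at the same time:

omping -m 239.255.1.1 -p 5405 ipa eta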

Your scenario seems to be a split-site, so you may be interested in
https://github.com/jjzhang/booth as well.

Regards,
Dan




--
Dan Frincu
CCNA, RHCE



bence at noc

Jan 26, 2012, 6:40 AM

Post #3 of 8 (1627 views)
Re: corosync vs. pacemaker 1.1 [In reply to]

Hi,

On 01/26/2012 02:13 PM, Dan Frincu wrote:
> Hi,
>
>
> First question, what versions of software are on each of the nodes?

Test bed nodes:

[root [at] virt ~]# corosync -v
Corosync Cluster Engine, version '1.4.2'
Copyright (c) 2006-2009 Red Hat, Inc.
[root [at] virt ~]# pacemakerd -$
Pacemaker 1.1.6-1.fc14
Written by Andrew Beekhof

[root [at] virt ~]# corosync -v;pacemakerd -$
Corosync Cluster Engine, version '1.4.2'
Copyright (c) 2006-2009 Red Hat, Inc.
Pacemaker 1.1.6-1.fc14
Written by Andrew Beekhof


Real nodes:

[root [at] ip ~]# corosync -v;pacemakerd -$
Corosync Cluster Engine, version '1.4.2'
Copyright (c) 2006-2009 Red Hat, Inc.
Pacemaker 1.1.6-1.fc14
Written by Andrew Beekhof

[root [at] et ~]# corosync -v;pacemakerd -$
Corosync Cluster Engine, version '1.4.2'
Copyright (c) 2006-2009 Red Hat, Inc.
Pacemaker 1.1.6-1.fc14
Written by Andrew Beekhof

>
> When using multicast, corosync doesn't care about "routing" the
> messages AFAIK, it relies on the network layer to do it's job. Now the
> "split-brain" you mention can take place due to network interruption,
> or due to missing or untested fencing as well.

I created a testing environment for the cluster before letting the
cluster software manage a real service. The testbed is two nodes on the
same machine under KVM virtualization, on the same network mentioned
above. There is no routing here; everything is in L2.

>
> Second question, do you have fencing configured?

No, and with only one channel of communication (the network) I don't
find it helpful. I have thought about a third quorum node somewhere
outside the two buildings. If the network goes down so badly that at
least two of the three nodes cannot see each other, the services might
as well be down, since no one could use them anyway.

>
> You've mentioned 2(?) nodes "5 hops away", I'm guessing they're not in
> the same datacenter. If so, did you also test the latency on the
> network between endpoints? Also can you make sure PIM routing is
> enabled on all of the "hops" along the way?
>

The real servers (ipa, eta) are not even in the same building; it's a
university campus. I was told the network supports multicast routing,
although with mtest.tgz (a simple multicast test utility) I cannot
confirm that it works well. This seems to be a local problem; I have
asked the local netadmin for help.
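
If it does turn out to be a firewall problem on the nodes themselves, I
assume rules along these lines would be needed on both of them (ports
taken from the default mcastport 5405; adjust to the actual
corosync.conf):

# allow IGMP and the corosync multicast/unicast ports
iptables -A INPUT -p igmp -j ACCEPT
iptables -A INPUT -p udp -m udp --dport 5404:5405 -j ACCEPT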

The network latency:

While drbd is UpToDate:
[root [at] ip ~]# time ping eta -i .01 -q -s 1472 -c 2000
PING eta () 1472(1500) bytes of data.

--- eta ping statistics ---
2000 packets transmitted, 2000 received, 0% packet loss, time 17976ms
rtt min/avg/max/mdev = 0.483/0.492/1.350/0.031 ms

real 0m17.979s
user 0m2.004s
sys 0m14.857s

While drbd is syncing:
[root [at] ip ~]# time ping eta -i .01 -q -s 1472 -c 2000
PING eta () 1472(1500) bytes of data.

--- eta ping statistics ---
2000 packets transmitted, 2000 received, 0% packet loss, time 17987ms
rtt min/avg/max/mdev = 0.482/6.217/9.572/1.885 ms

real 0m18.038s
user 0m0.652s
sys 0m4.708s

> Your scenario seems to be a split-site, so you may be interested in
> https://github.com/jjzhang/booth as well.

Yes it is, at least the real one. But what about the testbed, the
virtual machines on the same host? Aren't they supposed to work as the
document describes?

Thank you anyway! I will look into that daemon's development and see
whether I can use it.


Bence




andrew at beekhof

Jan 29, 2012, 7:00 PM

Post #4 of 8 (1604 views)
Re: corosync vs. pacemaker 1.1 [In reply to]

On Thu, Jan 26, 2012 at 2:08 AM, Kiss Bence <bence [at] noc> wrote:
> Hi,
>
> I am newbie to the clustering and I am trying to build a two node
> active/passive cluster based upon the documentation:
> http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
>
> My systems are Fedora 14, uptodate. After forming the cluster as wrote, I
> started to test it. (resources: drbd-> lvm-> fs ->group of services)
> Resources moved around, nodes rebooted and killed (first I tried it in
> virtual environment then also on real machines).
>
> After some events the two nodes ended up in a kind of state of split-brain.
> The crm_mon showed me that the other node is offline at both nodes although
> the drbd subsystem showed everything in sync and working. The network was
> not the issue (ping, tcp and udp communications were fine). Nothing changed
> from the network view.
>
> At first the rejoining took place quiet well, but some more events after it
> took longer and after more event it didn't. The network dump showed me the
> multicast packets still coming and going. At corosync (crm_node -l) the
> other node didn't appeared both on them. After trying configuring the cib
> logs was full of messages like "<the other node>: not in our membership".

That looks like a pacemaker bug.
Can you use crm_report to grab logs from about 30 minutes before the
first time you see this message until an hour after, please?

Attach that to a bug at bugs.clusterlabs.org and I'll take a look.
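
Something along these lines should do it (the timestamps are only an
example, adjust them to the window around the first occurrence):

crm_report -f "2012-01-25 12:30:00" -t "2012-01-25 14:00:00" /tmp/membership-issue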

>
> I tried to erase the config (crm configure erase, cibadmin -E -f) but it
> worked only locally. I noticed that the pacemaker process didn't started up
> normally on the node that was booting after the other. I also tried to
> remove files from /var/lib/pengine/ and /var/lib/hearbeat/crm/ but only the
> resources are gone. It didn't help on forming a cluster without resources.
> The pacemaker process exited some 20 minutes after it started. Manual
> starting was the same.
>
> After digging into google for answers I found nothing helpful. From running
> tips I changed in the /etc/corosync/service.d/pcmk file the version to 1.1
> (this is the version of the pacemaker in this distro). I realized that the
> cluster processes were startup from corosync itself not by pacemaker. Which
> could be omitted. The cluster forming is stable after this change even after
> many many events.
>
> Now I reread the document mentioned above, and I wonder why it wrote the
> "Important notice" on page 37. What is wrong theoretically with my scenario?

Having corosync start the daemons worked well for some but not others,
thus it was unreliable.
The notice points out a major difference between the two operating
modes so that people will not be caught by surprise when pacemaker
does not start.
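
In the documented mode the pacemaker daemons are managed by pacemakerd
rather than by the plugin, so both services have to be started (and
enabled) explicitly; on Fedora 14 that is roughly:

service corosync start
service pacemaker start
chkconfig corosync on
chkconfig pacemaker on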




bence at noc

Feb 10, 2012, 7:46 AM

Post #5 of 8 (1553 views)
Re: corosync vs. pacemaker 1.1 [In reply to]

Hi,

On 01/30/2012 04:00 AM, Andrew Beekhof wrote:
>
> That looks like a pacemaker bug.
> Can you use crm_report to grab logs from about 30 minutes prior to the
> first time you see this log until an hour after please?
>
> Attach that to a bug in bugs.clusterlabs.org and i'll take a look

I have created a bug report: id 5031.

The "split-brain" lasts about 5 minutes every time. During that time the
two nodes each think the other node is dead. However, drbd keeps working
fine and properly prevents the rebooted second node from becoming
Primary. crm_node -l shows only the local node.

Meanwhile one of my questions has been answered: the multicast issue was
a local network problem. The local netadmin fixed it and now it works.

This issue seems similar to what James Flatten reported on 8 February
([Pacemaker] Question about cluster start-up in a 2 node cluster with a
node offline). My configuration has stonith-enabled="false" and
no-quorum-policy="ignore".
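
For completeness, those were set with the usual property commands, i.e.
something like:

crm configure property stonith-enabled=false
crm configure property no-quorum-policy=ignore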

Thanks in advance,
Bence





andrew at beekhof

Feb 12, 2012, 5:09 PM

Post #6 of 8 (1534 views)
Re: corosync vs. pacemaker 1.1 [In reply to]

On Sat, Feb 11, 2012 at 2:46 AM, Kiss Bence <bence [at] noc> wrote:
> Hi,
>
>
> I had created a bug report: id 5031.

Perfect, I'll look there.




bence at noc

Feb 27, 2012, 10:22 PM

Post #7 of 8 (1479 views)
Re: corosync vs. pacemaker 1.1 [In reply to]

Hi Andrew,

did you have the time to look at the bug report? Is there anything
missing from it?

Another question, which may help me understand the problem better.

If a node fails, what is pacemaker's expected recovery behaviour with
stonith enabled, and without it? What is expected from the sysadmin in
this procedure?

Bence





andrew at beekhof

Feb 28, 2012, 1:22 AM

Post #8 of 8 (1502 views)
Re: corosync vs. pacemaker 1.1 [In reply to]

On Tue, Feb 28, 2012 at 5:22 PM, Kiss Bence <bence [at] noc> wrote:
> Hi Andrew,
>
>  did You have the time to look at the bug report? Are there anything missing
> from it?

Just looking now. My queue gets pretty long sometimes.

> An other question. Maybe it better helps for me to understand the problem.
>
> If a node fails, what is the expected behaviour of the pacemaker to recover
> the node with stonith enabled and without it?

Without it we blindly start the service on the remaining node and hope
it was a real failure.
Otherwise it's running in two places and your data is toast.

With it, we shoot the node and start the resource on one of the
remaining nodes.

> What is expected from the
> sysadmin in this procedure?

Nothing.
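
As a rough sketch only (the fence agent and its parameters below are
illustrative; pick an agent that matches your hardware and check its
metadata), enabling fencing looks something like:

crm configure primitive fence-eta stonith:fence_ipmilan \
        params pcmk_host_list="eta" ipaddr="192.0.2.12" \
               login="admin" passwd="secret" \
        op monitor interval="60s"
crm configure property stonith-enabled="true"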


