
Mailing List Archive: Linux-HA: Users

Connecting to the Cluster Configuration from a Remote Machine

 

 



boda2004 at gmail

Sep 4, 2009, 8:35 AM

Post #1 of 19 (3198 views)
Connecting to the Cluster Configuration from a Remote Machine

Hi. I'm trying to enable remote connections to the cluster, but with no
luck: netstat does not show those ports as open, and the logs tell me
nothing either.
I use pacemaker-1.0.5 and openais-1.0.1 on Gentoo Linux.

Thanks.

Below is the cluster configuration, with the parts I consider irrelevant
stripped.
<cib validate-with="pacemaker-1.0" crm_feature_set="3.0.1" have-quorum="0"
     admin_epoch="0" epoch="297" num_updates="1"
     cib-last-written="Fri Sep 4 17:55:36 2009" remote-tls-port="1234"
     remote-clear-port="12345" dc-uuid="box2">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version"
                value="1.0.5-462f1569a43740667daf7b0f6b521742e9eb8fa7"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure"
                name="cluster-infrastructure" value="openais"/>
        <nvpair id="cib-bootstrap-options-expected-quorum-votes"
                name="expected-quorum-votes" value="2"/>
        <nvpair id="cib-bootstrap-options-stonith-enabled"
                name="stonith-enabled" value="false"/>
        <nvpair id="cib-bootstrap-options-last-lrm-refresh"
                name="last-lrm-refresh" value="1251891300"/>
        <nvpair id="cib-bootstrap-options-no-quorum-policy"
                name="no-quorum-policy" value="ignore"/>
        <nvpair id="cib-bootstrap-options-default-resource-stickiness"
                name="default-resource-stickiness" value="1000"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="box1" uname="box1" type="normal"/>
      <node id="box2" type="normal" uname="box2"/>
    </nodes>
    <resources>
      [...]
    </resources>
    <constraints>
      [...]
    </constraints>
    <rsc_defaults/>
    <op_defaults/>
  </configuration>
  <status>
    [...]
  </status>
</cib>
_______________________________________________
Linux-HA mailing list
Linux-HA [at] lists
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems


andrew at beekhof

Sep 7, 2009, 11:26 PM

Post #2 of 19 (3092 views)
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

On Fri, Sep 4, 2009 at 5:35 PM, Alexander Bodnarashik<boda2004 [at] gmail> wrote:
> Hi. I'm trying to enable remote connections to cluster, but with no
> luck, netstat does not show those ports as opened, logs tell me
> nothing as well.

Were those port values in the CIB when the cluster started? If not,
restart the cluster software.
Otherwise, check if TLS support was enabled when you built pacemaker.
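
A rough sketch of the first check, assuming the CIB has been dumped to a file (for example with cibadmin -Q). The helper name cib_attr is hypothetical and not part of pacemaker; it uses sed rather than a real XML parser, so it only suits a quick look at the opening cib tag.

```shell
#!/bin/sh
# cib_attr NAME
# Read CIB XML on stdin and print the value of attribute NAME from the
# opening <cib ...> tag. Hypothetical helper, not a pacemaker tool.
cib_attr() {
    # Join lines first, because the opening tag is often wrapped.
    { tr '\n' ' '; echo; } |
        sed -n "s/.*<cib[^>]*[[:space:]]$1=\"\([^\"]*\)\".*/\1/p"
}

# Assumed usage on a cluster node (file path is only an example):
#   cibadmin -Q > /tmp/cib.xml
#   port=$(cib_attr remote-tls-port < /tmp/cib.xml)
#   netstat -tln | grep ":$port "
```

If the attribute is present but netstat shows no listener after a restart, the problem is more likely in the daemon than in the configuration.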



boda2004 at gmail

Sep 9, 2009, 1:30 AM

Post #3 of 19 (3073 views)
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

On Sep 08, 2009, at 09:26, Andrew Beekhof wrote:

> On Fri, Sep 4, 2009 at 5:35 PM, Alexander Bodnarashik<boda2004 [at] gmail
> > wrote:
>> Hi. I'm trying to enable remote connections to cluster, but with no
>> luck, netstat does not show those ports as opened, logs tell me
>> nothing as well.
>
> Were those port values in the CIB when the cluster started? If not,
> restart the cluster software.
> Otherwise, check if TLS support was enabled when you built pacemaker.

Both port values were set before the cluster started.

I didn't find any TLS-related options in pacemaker's "./configure", but
gnutls was found on the system during the configure script run:
...
checking gnutls/gnutls.h usability... yes
checking gnutls/gnutls.h presence... yes
checking for gnutls/gnutls.h... yes
checking for security/pam_appl.h... (cached) yes
checking for pam/pam_appl.h... (cached) no
checking for libgnutls-config... /usr/bin/libgnutls-config
checking for gnutls header flags... -I/usr/include
checking for gnutls library flags... -L/usr/lib -lgnutls -lgcrypt -lgpg-error
...

Also, cibadmin is linked against gnutls:
ldd `which cibadmin`|grep tls
libgnutls.so.26 => /usr/lib/libgnutls.so.26 (0xb7fc5000)
So I suppose that TLS is enabled.
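
The linkage check can be wrapped so that it gives a plain yes/no answer. This is only a sketch: links_gnutls is a hypothetical helper that reads ldd output on stdin. Linking against libgnutls shows the library was available at build time; it does not prove that the remote listener was actually started.

```shell
#!/bin/sh
# links_gnutls
# Succeed if the ldd output read on stdin mentions libgnutls.
# Hypothetical helper, not a pacemaker tool.
links_gnutls() {
    grep -q 'libgnutls\.so'
}

# Assumed usage on a cluster node:
#   ldd "$(command -v cibadmin)" | links_gnutls && echo "gnutls linked"
```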

I'm also attaching logs, corosync config and cib.
Thanks.
Attachments: cib.xml.gz (2.63 KB)
  corosync.conf.gz (0.34 KB)
  corosync.log.gz (1.39 KB)
  messages.gz (9.30 KB)


andrew at beekhof

Sep 10, 2009, 1:05 PM

Post #4 of 19 (3063 views)
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

Strange. I'll take a look on Monday (after my vacation).



andrew at beekhof

Sep 21, 2009, 3:53 AM

Post #5 of 19 (2982 views)
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

I had a look at this, and basically I broke the initialization.
I'll fix this today for 1.0.6



boda2004 at gmail

Sep 28, 2009, 8:38 AM

Post #6 of 19 (2924 views)
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

Thanks for the fix :)
I've checked out http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/05c8b63cbca7
Now ports are open.

I've run into another problem, though.
I have 2 boxes in the cluster, box1 and box2. A third box, not in the
cluster, is named farm.
All of them are running Gentoo. The cluster stack is openais-1.0.1.

Trying to issue cibadmin -Q from farm:
1234 - plain port
> CIB_server=box1.cluster CIB_port=1234 cibadmin -Q
> Password:
> cibadmin: Connection to box1.cluster:1234 failed:
> Signon to CIB failed:
> Init failed, could not perform requested operations
and exits immediately

12345 - tls port
> CIB_server=box1.cluster CIB_port=12345 cibadmin -Q
> Password:
>

and it freezes. In the logs on box1 I can see the following:
> Sep 28 18:03:30 box1 cib: [3342]: ERROR: crm_xml_err: XML Error:
> Entity: line 1: parsererror : Start tag expected, '<' not found
> Sep 28 18:03:30 box1 cib: [3342]: ERROR: crm_xml_err: XML Error:
> Sep 28 18:03:30 box1 cib: [3342]: ERROR: crm_xml_err: XML Error: ^
> Sep 28 18:03:30 box1 cib: [3342]: WARN: string2xml: Parsing failed
> (domain=1, level=3, code=4): Start tag expected, '<' not found
> Sep 28 18:03:30 box1 cib: [3342]: ERROR: string2xml: Couldn't parse
> 3 chars:
> Sep 28 18:03:30 box1 cib: [3342]: ERROR: cib_recv_remote_msg:
> Couldn't parse: ''

After that, I can run neither crm_mon (it prints "Attempting connection
to the cluster...") nor cibadmin -Q on box1; the latter waits for a
while and then prints the following:
> Signon to CIB failed: reply failed
> Init failed, could not perform requested operations

On box2, crm_mon runs, but it doesn't reflect changes in the cluster.
Running cibadmin -Q waits for a while, then shows the following:
> Call cib_query failed (-41): Remote node did not respond
> <null>

Finally, a few minutes later I found errors in the logs (I think they were
caused by my attempt to connect to the cluster remotely), so I'm attaching them.

Thanks.

Attachments: log.txt (6.05 KB)


andrew at beekhof

Oct 6, 2009, 11:35 AM

Post #7 of 19 (2848 views)
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

On Mon, Sep 28, 2009 at 5:38 PM, Alexander Bodnarashik
<boda2004 [at] gmail> wrote:
> Thanks for the fix :)
> I've checked out
> http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/05c8b63cbca7
> Now ports are open.
>
> I've encountered other problem though.
> I have 2 boxes in cluster - box1 and box2. Third box, not in cluster, is
> named farm.
> All of them are running Gentoo.

All have the same version of pacemaker?

> Cluster stack - openais-1.0.1
>
> Trying to issue cibadmin -Q from farm:
> 1234 - plain port
>>
>> CIB_server=box1.cluster CIB_port=1234 cibadmin -Q

So a couple of things here (that you couldn't possibly be expected to
know, sorry, i forgot to mention them at the time)...

You need to set CIB_user to the user that the remote node runs the CIB
as (e.g. hacluster).
For plaintext connections, you need to set CIB_encrypted=false.

Actually that first one needs to be the default (since non-root
daemons can only do PAM authentication for the user they're running
as).
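
Put together, those settings might be wrapped as in the sketch below. remote_cib_env is a hypothetical helper, not something pacemaker ships. The variable names CIB_user, CIB_server, CIB_port and CIB_encrypted are the ones used in this thread; the 0/1 values for CIB_encrypted follow the working example posted later in the thread, and the hacluster default is an assumption.

```shell
#!/bin/sh
# remote_cib_env HOST PORT MODE [USER]
# Print the environment assignments for a remote CIB connection.
# MODE is "tls" or "plain"; USER defaults to hacluster (assumed default).
# Hypothetical helper, not a pacemaker tool.
remote_cib_env() {
    host=$1
    port=$2
    mode=$3
    user=${4:-hacluster}
    case $mode in
        plain) encrypted=0 ;;
        tls)   encrypted=1 ;;
        *)     echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    printf 'CIB_user=%s CIB_server=%s CIB_port=%s CIB_encrypted=%s\n' \
        "$user" "$host" "$port" "$encrypted"
}

# Assumed usage from a machine outside the cluster, where clear_port holds
# the cluster's remote-clear-port value:
#   env $(remote_cib_env box1.cluster "$clear_port" plain) cibadmin -Q
```

Keeping the mode and the port together in one call makes it harder to point a TLS client at the plaintext port by accident.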

>> Password:
>> cibadmin: Connection to box1.cluster:1234 failed:
>> Signon to CIB failed:
>> Init failed, could not perform requested operations
>
> and exits immediately
>
> 12345 - tls port
>>
>> CIB_server=box1.cluster CIB_port=12345 cibadmin -Q
>> Password:
>>
>
> and it freezes. In logs on box1 i can see following:
>>
>> Sep 28 18:03:30 box1 cib: [3342]: ERROR: crm_xml_err: XML Error: Entity:
>> line 1: parsererror : Start tag expected, '<' not found
>> Sep 28 18:03:30 box1 cib: [3342]: ERROR: crm_xml_err: XML Error:
>> Sep 28 18:03:30 box1 cib: [3342]: ERROR: crm_xml_err: XML Error: ^
>> Sep 28 18:03:30 box1 cib: [3342]: WARN: string2xml: Parsing failed
>> (domain=1, level=3, code=4): Start tag expected, '<' not found
>> Sep 28 18:03:30 box1 cib: [3342]: ERROR: string2xml: Couldn't parse 3
>> chars:
>> Sep 28 18:03:30 box1 cib: [3342]: ERROR: cib_recv_remote_msg: Couldn't
>> parse: ''
>
> After that i'm unable to run on box1 neither crm_mon (writes Attempting
> connection to the cluster...) nor cibadmin -Q - it waits for a while and
> then writes following:
>>
>> Signon to CIB failed: reply failed
>> Init failed, could not perform requested operations

That's very disturbing.
Can you try running the CIB under valgrind to see if it reports
anything of interest?

Install valgrind, set the following, and then start the cluster:

export HA_VALGRIND_ENABLED=cib
export VALGRIND_OPTS="--log-file=/tmp/pacemaker-%p.valgrind --leak-check=full --show-reachable=yes --trace-children=no --num-callers=25"

>
> On box2 crm_mon runs, but it doesn't reflect changes in cluster. running
> cibadmin -Q waits for a while, then shows following:
>>
>> Call cib_query failed (-41): Remote node did not respond
>> <null>
>
> Finally, in a few minutes i've found errors in logs (i think they are caused
> by my attempt to connect to cluster remotely), so attaching.
>
> Thanks.


boda2004 at gmail

Oct 7, 2009, 2:01 AM

Post #8 of 19 (2838 views)
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

Current cluster configuration (box1 started only):
> box1 ~ # crm_mon -1
>
>
> ============
> Last updated: Wed Oct 7 10:19:57 2009
> Stack: openais
> Current DC: box1 - partition WITHOUT quorum
> Version: 1.0.5-05c8b63cbca7ce95182bb41881b3c5677f20bd5c
> 3 Nodes configured, 3 expected votes
> 3 Resources configured.
> ============
>
> Online: [ box1 ]
> OFFLINE: [ box2 fc12-node1 ]
>
> Master/Slave Set: ms_drbd
> Masters: [ box1 ]
> Stopped: [ drbd:1 ]
> Clone Set: cl_pingd
> Started: [ box1 ]
> Stopped: [ pingd:1 pingd:2 ]
> Resource Group: mysql_service_group
> fs_r0 (ocf::heartbeat:Filesystem): Started box1
> ip_mysql (ocf::heartbeat:IPaddr2): Started box1
> mysql (ocf::heartbeat:mysql): Started box1

node not in cluster: farm
> farm ~ # cibadmin -$
> cibadmin 1.0.5 for OpenAIS (Build:
> 05c8b63cbca7ce95182bb41881b3c5677f20bd5c)
>
> Written by Andrew Beekhof


I run on box1:
> box1 ~ # export HA_VALGRIND_ENABLED=cib
> box1 ~ # export VALGRIND_OPTS="--log-file=/tmp/pacemaker-%p.valgrind --leak-check=full --show-reachable=yes --trace-children=no --num-callers=25"
> box1 ~ # aisexec

on farm:
> farm ~ # CIB_server=box1.cluster CIB_port=1234 CIB_user=hacluster
> cibadmin -Q
> Password:
> cibadmin: Connection to box1.cluster:1234 failed:
> Signon to CIB failed:
> Init failed, could not perform requested operations
> farm ~ # CIB_server=box1.cluster CIB_port=12345 CIB_user=hacluster
> CIB_encrypted=false cibadmin -Q
> Password:
> cibadmin: Connection to box1.cluster:12345 failed:
> Signon to CIB failed:
> Init failed, could not perform requested operations
> farm ~ #
Although I got no results, the cluster is operational and crm_mon on box1
works fine.
Each attempt produced the following in the logs on box1:
> Oct 7 11:43:01 box1 cib: [3458]: info: log_data_element:
> cib_remote_listen: Login: <cib_command op="authenticate"
> user="hacluster" password="*****" hidden="password" />
> Oct 7 11:43:01 box1 cib: [3458]: ERROR: cib_remote_listen: User is
> not a member of the required group
So it seems something is wrong with the hacluster user's group:
> box1 pc # id hacluster
> uid=65(hacluster) gid=65(haclient) groups=65(haclient)
> box1 pc # getent passwd hacluster
> hacluster:x:65:65:added by portage for cluster-glue:/var/lib/heartbeat:/sbin/nologin


Connecting to the plain port without CIB_encrypted=false does cause the
cluster to malfunction, though:
> Oct 7 11:50:18 box1 cib: [3458]: ERROR: crm_xml_err: XML Error:
> Entity: line 1: parsererror : Start tag expected, '<' not found
> Oct 7 11:50:18 box1 cib: [3458]: ERROR: crm_xml_err: XML Error:
> Oct 7 11:50:18 box1 cib: [3458]: ERROR: crm_xml_err: XML Error: ^
> Oct 7 11:50:18 box1 cib: [3458]: WARN: string2xml: Parsing failed
> (domain=1, level=3, code=4): Start tag expected, '<' not found
> Oct 7 11:50:18 box1 cib: [3458]: ERROR: string2xml: Couldn't parse
> 3 chars:
> Oct 7 11:50:18 box1 cib: [3458]: ERROR: cib_recv_remote_msg:
> Couldn't parse: ''


Attaching files, which may be useful.



Attachments: cibadmin-q.txt.gz (2.72 KB)
  corosync.conf.gz (0.34 KB)
  corosync.log.gz (1.47 KB)
  messages.gz (9.78 KB)
  pacemaker-3458.valgrind.gz (0.79 KB)
  ps-ax.txt.gz (0.52 KB)


andrew at beekhof

Oct 21, 2009, 1:38 AM

Post #9 of 19 (2655 views)
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

On Wed, Oct 7, 2009 at 11:01 AM, Alexander Bodnarashik
<boda2004 [at] gmail> wrote:
> Current cluster configuration (box1 started only):
>> Oct 7 11:43:01 box1 cib: [3458]: ERROR: cib_remote_listen: User is not a
>> member of the required group
>
> So seems something wrong with hacluster user group

You may need to explicitly add it to the group in /etc/group.
That's what I needed to do here.
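
One way to see whether the user is listed explicitly is to inspect the member field of the group entry (the group database is /etc/group on Linux). A sketch, assuming the standard name:password:gid:members layout; explicit_member is a hypothetical helper. A user whose primary group is haclient, as in the id output quoted earlier, can still be absent from that member list.

```shell
#!/bin/sh
# explicit_member USER
# Read one group entry (name:passwd:gid:member,member,...) on stdin and
# succeed only if USER appears in the comma-separated member list.
# Hypothetical helper, not a pacemaker tool.
explicit_member() {
    awk -F: -v user="$1" '
        {
            n = split($4, members, ",")
            for (i = 1; i <= n; i++)
                if (members[i] == user) found = 1
        }
        END { exit found ? 0 : 1 }'
}

# Assumed usage on the cluster node (usermod needs root):
#   getent group haclient | explicit_member hacluster ||
#       usermod -a -G haclient hacluster
```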

> Attempt to connect to plain port without CIB_encrypted=false causes cluster
> malfunction though
>>
>> Oct 7 11:50:18 box1 cib: [3458]: ERROR: crm_xml_err: XML Error: Entity:
>> line 1: parsererror : Start tag expected, '<' not found
>> Oct 7 11:50:18 box1 cib: [3458]: ERROR: crm_xml_err: XML Error:
>> Oct 7 11:50:18 box1 cib: [3458]: ERROR: crm_xml_err: XML Error: ^
>> Oct 7 11:50:18 box1 cib: [3458]: WARN: string2xml: Parsing failed
>> (domain=1, level=3, code=4): Start tag expected, '<' not found
>> Oct 7 11:50:18 box1 cib: [3458]: ERROR: string2xml: Couldn't parse 3
>> chars:
>> Oct 7 11:50:18 box1 cib: [3458]: ERROR: cib_recv_remote_msg: Couldn't
>> parse: ''

Finally tracked this one down:
http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/f8a5c5056dfe


[root [at] pcmk- ~]# CIB_user=hacluster CIB_server=pcmk-2.beekhof.net
CIB_port=9234 CIB_encrypted=0 cibadmin -Q
Password:
cibadmin: Opened connection to pcmk-2.beekhof.net:9234
<cib validate-with="pacemaker-1.0" crm_feature_set="3.0.1"
have-quorum="0" admin_epoch="1" epoch="15" num_updates="0"
remote-clear-port="9234" remote-tls-port="9235" cib-last-written="Wed
Oct 21 09:28:57 2009">
<configuration>
<crm_config>
<cluster_property_set id="cib-bootstrap-options">
<nvpair id="cib-bootstrap-options-dc-version"
name="dc-version"
value="1.0.5-38cd629e5c3cc762918900ff50e576ac8f1a3988"/>
<nvpair id="cib-bootstrap-options-cluster-infrastructure"
name="cluster-infrastructure" value="openais"/>
<nvpair id="cib-bootstrap-options-expected-quorum-votes"
name="expected-quorum-votes" value="2"/>
</cluster_property_set>
</crm_config>
<nodes>
<node id="pcmk-2.beekhof.net" uname="pcmk-2.beekhof.net" type="normal"/>
<node id="pcmk-1.beekhof.net" uname="pcmk-1.beekhof.net" type="normal"/>
</nodes>
<resources/>
<constraints/>
</configuration>
<status/>
</cib>
[root [at] pcmk- ~]# CIB_user=hacluster CIB_server=pcmk-2.beekhof.net
CIB_port=9235 CIB_encrypted=1 cibadmin -Q
Password:
cibadmin: Opened connection to pcmk-2.beekhof.net:9235
<cib validate-with="pacemaker-1.0" crm_feature_set="3.0.1"
have-quorum="0" admin_epoch="1" epoch="15" num_updates="0"
remote-clear-port="9234" remote-tls-port="9235" cib-last-written="Wed
Oct 21 09:28:57 2009">
<configuration>
<crm_config>
<cluster_property_set id="cib-bootstrap-options">
<nvpair id="cib-bootstrap-options-dc-version"
name="dc-version"
value="1.0.5-38cd629e5c3cc762918900ff50e576ac8f1a3988"/>
<nvpair id="cib-bootstrap-options-cluster-infrastructure"
name="cluster-infrastructure" value="openais"/>
<nvpair id="cib-bootstrap-options-expected-quorum-votes"
name="expected-quorum-votes" value="2"/>
</cluster_property_set>
</crm_config>
<nodes>
<node id="pcmk-2.beekhof.net" uname="pcmk-2.beekhof.net" type="normal"/>
<node id="pcmk-1.beekhof.net" uname="pcmk-1.beekhof.net" type="normal"/>
</nodes>
<resources/>
<constraints/>
</configuration>
<status/>
</cib>
[root [at] pcmk- ~]#


boda2004 at gmail

Oct 21, 2009, 2:54 AM

Post #10 of 19 (2650 views)
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

On Oct 21, 2009, at 11:38, Andrew Beekhof wrote:

> On Wed, Oct 7, 2009 at 11:01 AM, Alexander Bodnarashik
> <boda2004 [at] gmail> wrote:
>> Current cluster configuration (box1 started only):
>>> Oct 7 11:43:01 box1 cib: [3458]: ERROR: cib_remote_listen: User
>>> is not a
>>> member of the required group
>>
>> So seems something wrong with hacluster user group
>
> You may need to explicitly add it to the group in /etc/groups
> Thats what I needed to do here.

I've explicitly added hacluster to the haclient group in /etc/passwd; I
also had to change /sbin/nologin to /bin/bash to be able to connect to
the cluster remotely.

As far as I can see, TLS works fine (although it produces some TLS-related
errors in the logs), while the plain-text connection returns a truncated
response (the configuration section lacks most of its data, and there is
no status section at all). I've attached the logs from box1 and the
console output from farm.
farm is a server outside the cluster; box1 is a cluster node.

Thank you.



andrew at beekhof

Oct 21, 2009, 3:23 AM

Post #11 of 19 (2659 views)
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

On Wed, Oct 21, 2009 at 11:54 AM, Alexander Bodnarashik
<boda2004 [at] gmail> wrote:
>
> On Oct 21, 2009, at 11:38, Andrew Beekhof wrote:
>
>> On Wed, Oct 7, 2009 at 11:01 AM, Alexander Bodnarashik
>> <boda2004 [at] gmail> wrote:
>>> Current cluster configuration (box1 started only):
>>>> Oct 7 11:43:01 box1 cib: [3458]: ERROR: cib_remote_listen: User
>>>> is not a
>>>> member of the required group
>>>
>>> So seems something wrong with hacluster user group
>>
>> You may need to explicitly add it to the group in /etc/groups
>> Thats what I needed to do here.
>
> I've added hacluster to haclient group in /etc/passwd explicitly, also
> had to change /sbin/nologin to /bin/bash, to be able to connect to
> cluster remotely.
>
> As far as i can see tls works fine (despite it produces some tls-
> related errors in logs), and plain text connection produces short
> response

is this with the patch i mentioned in the last email?

> (configuration section lacks most of data, also no status
> section at all) - attached logs

nope :-)

> on box1 and console output on farm.
> farm - server outside cluster, box1 - cluster node.
>


boda2004 at gmail

Oct 21, 2009, 4:21 AM

Post #12 of 19 (2653 views)
Permalink
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

On Oct 21, 2009, at 13:23, Andrew Beekhof wrote:
>>
>> As far as i can see tls works fine (despite it produces some tls-
>> related errors in logs), and plain text connection produces short
>> response
>
> is this with the patch i mentioned in the last email?
>
Yes.
Sorry, I forgot to mention that:
> farm ~ # cibadmin --version
> cibadmin 1.0.5 for OpenAIS (Build:
> f8a5c5056dfec20231398a90490a6de034df1f1e)
>
> Written by Andrew Beekhof

> box1 ~ # cibadmin --version
> cibadmin 1.0.5 for OpenAIS (Build:
> f8a5c5056dfec20231398a90490a6de034df1f1e)
>
> Written by Andrew Beekhof



andrew at beekhof

Oct 28, 2009, 1:39 PM

Post #13 of 19 (2528 views)
Permalink
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

You still forgot the attachment :-)

On Wed, Oct 21, 2009 at 1:21 PM, Alexander Bodnarashik
<boda2004 [at] gmail> wrote:
>
> On Oct 21, 2009, at 13:23, Andrew Beekhof wrote:
>>>
>>> As far as i can see tls works fine (despite it produces some tls-
>>> related errors in logs), and plain text connection produces short
>>> response
>>
>> is this with the patch i mentioned in the last email?
>>
> Yes.
> Sorry, forget to mention that:
>> farm ~ # cibadmin --version
>> cibadmin 1.0.5 for OpenAIS (Build:
>> f8a5c5056dfec20231398a90490a6de034df1f1e)
>>
>> Written by Andrew Beekhof
>
>> box1 ~ # cibadmin --version
>> cibadmin 1.0.5 for OpenAIS (Build:
>> f8a5c5056dfec20231398a90490a6de034df1f1e)
>>
>> Written by Andrew Beekhof
>


boda2004 at gmail

Oct 29, 2009, 1:52 AM

Post #14 of 19 (2516 views)
Permalink
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

Strange. I can see my email in the Sent folder with the attachment
log.txt. Reattaching it as a gzipped file (in case .txt files aren't
allowed).

By the way, http://lists.linux-ha.org/mailman/listinfo/linux-ha gives
me a 404 Not Found.

On Oct 28, 2009, at 22:39, Andrew Beekhof wrote:

> You still forgot the attachment :-)
>
Attachments: log.txt.gz (1.57 KB)


andrew at beekhof

Nov 5, 2009, 3:55 AM

Post #15 of 19 (2374 views)
Permalink
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

On Thu, Oct 29, 2009 at 9:52 AM, Alexander Bodnarashik
<boda2004 [at] gmail> wrote:
> Strange. I can see my email in sent folder with attachment log.txt.
> Reattaching gzipped file (in case txt files aren't allowed)
>
> btw: http://lists.linux-ha.org/mailman/listinfo/linux-ha gives me 404 Not
> Found.

I've no access to do anything about that, but I'll have a look at the
attachment :-)



andrew at beekhof

Nov 5, 2009, 4:01 AM

Post #16 of 19 (2373 views)
Permalink
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

On Thu, Oct 29, 2009 at 9:52 AM, Alexander Bodnarashik
<boda2004 [at] gmail> wrote:
> Strange. I can see my email in sent folder with attachment log.txt.
> Reattaching gzipped file (in case txt files aren't allowed)

It doesn't look like the CIB is running... nothing in the logs and
corosync doesn't see it stop.

Btw. You really need to get the latest code if you're using corosync.

Minimum versions are corosync 1.1.2 and pacemaker 1.0.6
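
The minimum versions above can be compared against what is installed with a few lines of shell. This is a rough sketch, not something from the Pacemaker tree: the helper names are invented here, the parsing assumes the output formats shown elsewhere in this thread ("cibadmin 1.0.5 for OpenAIS (Build: ...)" and "Corosync Cluster Engine, version '1.1.2' ..."), and `sort -V` is assumed to be available (GNU coreutils).

```shell
#!/bin/sh
# Hypothetical helpers for checking the minimum versions named above.

# Succeed when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) exists on this host.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Read `cibadmin --version` style output on stdin and print the version,
# which is the second word of the first line.
cibadmin_version() {
    awk 'NR == 1 { print $2 }'
}

# Read `corosync -v` style output on stdin and print the quoted version.
corosync_version() {
    sed -n "s/.*version '\([0-9.]*\)'.*/\1/p" | head -n 1
}
```

On a node this could be used as `version_ge "$(cibadmin --version | cibadmin_version)" 1.0.6` and `version_ge "$(corosync -v | corosync_version)" 1.1.2`.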


boda2004 at gmail

Nov 5, 2009, 6:36 AM

Post #17 of 19 (2371 views)
Permalink
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

Setup: farm is a server outside the cluster; box1 is a cluster node
(online); box2 and fc12-node1 are shut down.

I've updated pacemaker and corosync to the latest on farm and box1:
> box1 ~ (18:23:15) # cibadmin -$
> cibadmin 1.0.6 for OpenAIS (Build:
> 73a6626a39bd4586fb555404a79fc21ce8228d8b)
>
> Written by Andrew Beekhof
> box1 ~ (18:23:20) # corosync -v
> Corosync Cluster Engine, version '1.1.2' SVN revision 'exported'
> Copyright (c) 2006-2009 Red Hat, Inc.

The plain-text connection still seems to fail, just as it did before
http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/f8a5c5056dfe:
> farm ~ (18:12:55) # CIB_server=box1.cluster CIB_port=12345
> CIB_user=hacluster CIB_encrypted=false cibadmin -Q
> Password:
> cibadmin: Opened connection to box1.cluster:12345
> Call cib_query failed (-41): Remote node did not respond
> <null>
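
To repeat the same query against the plain-text and TLS ports without retyping the environment each time, a small wrapper along these lines might help. It is a sketch under assumptions: `remote_cib_query` and the `CIBADMIN` override are names invented here (the override only exists so the wrapper can be exercised without a cluster), while `CIB_server`, `CIB_port`, `CIB_user`, `CIB_encrypted` and `cibadmin -Q` are the ones used in the command above.

```shell
#!/bin/sh
# Hypothetical wrapper around the remote query shown above.
#   $1 = host, $2 = port, $3 = true|false (whether to use TLS)
# CIBADMIN is an illustrative override, not a Pacemaker variable.
remote_cib_query() {
    CIB_server="$1" CIB_port="$2" CIB_user="${CIB_user:-hacluster}" \
        CIB_encrypted="$3" "${CIBADMIN:-cibadmin}" -Q
}
```

With the ports from the cib element in the first post, `remote_cib_query box1.cluster 12345 false` would try the plain-text port and `remote_cib_query box1.cluster 1234 true` the TLS one; cibadmin still prompts for the password as shown above.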

SSL works fine, though some errors are logged on box1:
> Nov 5 18:13:44 box1 cib: [3644]: ERROR: cib_recv_tls: Error
> receiving message: -9: Success (0)
> Nov 5 18:13:44 box1 cib: [3644]: ERROR: cib_recv_remote_msg: Empty
> reply
> Nov 5 18:13:44 box1 cib: [3644]: ERROR: cib_recv_tls: Error
> receiving message: -10: Resource temporarily unavailable (11)
> Nov 5 18:13:44 box1 cib: [3644]: ERROR: cib_recv_remote_msg: Empty
> reply
> Nov 5 18:13:44 box1 cib: [3644]: ERROR: cib_recv_tls: Error
> receiving message: -9: Resource temporarily unavailable (11)
> Nov 5 18:13:44 box1 cib: [3644]: ERROR: cib_recv_remote_msg: Empty
> reply
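
When the log is longer than this excerpt, counting the cib errors by reporting function gives a quicker overview. The sketch below assumes syslog lines in the format shown above, one entry per line (the wrapping in this mail comes from the mail client); the function name `count_cib_errors` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read syslog lines on stdin and print
# "<count> <function>" for every cib ERROR entry, sorted by function name.
count_cib_errors() {
    awk '
        / cib: \[[0-9]+\]: ERROR: / {
            # The word after "ERROR:" is the reporting function, e.g. "cib_recv_tls:".
            for (i = 1; i < NF; i++) {
                if ($i == "ERROR:") {
                    fn = $(i + 1)
                    sub(/:$/, "", fn)
                    counts[fn]++
                    break
                }
            }
        }
        END { for (fn in counts) print counts[fn], fn }
    ' | sort -k2,2
}
```

It could be run as `count_cib_errors < /var/log/messages`, adjusting the path to wherever the node writes its syslog.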

Attaching logs, configs and console output.
Thanks.



On Nov 05, 2009, at 14:01, Andrew Beekhof wrote:

> On Thu, Oct 29, 2009 at 9:52 AM, Alexander Bodnarashik
> <boda2004 [at] gmail> wrote:
>> Strange. I can see my email in sent folder with attachment log.txt.
>> Reattaching gzipped file (in case txt files aren't allowed)
>
> It doesn't look like the CIB is running... nothing in the logs and
> corosync doesn't see it stop.
>
> Btw. You really need to get the latest code if you're using corosync.
>
> Minimum versions are corosync 1.1.2 and pacemaker 1.0.6
Attachments: cibadmin-Q.xml.gz (2.68 KB)
  corosync.log.gz (1.49 KB)
  farm_console.txt.gz (2.92 KB)
  messages.gz (9.31 KB)
  corosync.conf.gz (0.34 KB)


andrew at beekhof

Nov 19, 2009, 11:30 AM

Post #18 of 19 (2084 views)
Permalink
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

Fixed!

http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/971d8989e9f0

Sorry for the delay. Testing with a larger config made the problem
pretty obvious.



boda2004 at gmail

Nov 20, 2009, 2:40 AM

Post #19 of 19 (2070 views)
Permalink
Re: Connecting to the Cluster Configuration from a Remote Machine [In reply to]

Hi.
Unfortunately, I'm still able to reproduce the plain-text connection
error.

> farm ~ (14:22:51) # CIB_server=box1.cluster CIB_port=12345
> CIB_user=hacluster CIB_encrypted=false cibadmin -Q
> Password:
> cibadmin: Opened connection to box1.cluster:12345
> Call cib_query failed (-41): Remote node did not respond
> <null>
> farm ~ (14:23:37) #

Software versions:

> farm ~ (14:23:37) # cibadmin -$
> cibadmin 1.0.6 for OpenAIS (Build:
> f7a8250d23fce1fc596aa09bf6b00c126253a498)
>
> Written by Andrew Beekhof

> box1 ~ (14:23:42) # cibadmin -$
> cibadmin 1.0.6 for OpenAIS (Build:
> f7a8250d23fce1fc596aa09bf6b00c126253a498)
>
> Written by Andrew Beekhof

> farm ~ (14:26:35) # aisexec -v
> Corosync Cluster Engine, version '1.1.2' SVN revision 'exported'
> Copyright (c) 2006-2009 Red Hat, Inc.

> box1 ~ (14:27:21) # aisexec -v
> Corosync Cluster Engine, version '1.1.2' SVN revision 'exported'
> Copyright (c) 2006-2009 Red Hat, Inc.

The cluster configuration remains unchanged.
Attaching the logs and the console output from farm.

On Nov 19, 2009, at 21:30, Andrew Beekhof wrote:

> Fixed!
>
> http://hg.clusterlabs.org/pacemaker/stable-1.0/rev/971d8989e9f0
>
> Sorry for the delay. Testing with a larger config made the problem
> pretty obvious.
Attachments: corosync.log.gz (1.50 KB)
  farm.txt.gz (2.86 KB)
  messages.gz (9.54 KB)
