
Mailing List Archive: Linux-HA: Pacemaker

crm_mon on Node-2 shows both Node-1 & Node-2 as online but crm_mon on Node-1 shows Node-2 as offline

 

 



parshvi.17 at gmail

Apr 19, 2012, 5:56 AM

Post #1 of 5 (1252 views)
crm_mon on Node-2 shows both Node-1 & Node-2 as online but crm_mon on Node-1 shows Node-2 as offline

1) What is the use of passwordless ssh (“ssh without pass key”) between cluster nodes in Pacemaker?
a. Use case:
i. Two nodes in a cluster (call them Node-1 and Node-2).
ii. One interface is configured in corosync.conf for heartbeat/messaging,
e.g. bindnetaddr: 192.168.10.0
iii. Another interface is configured in /etc/hosts for hostname resolution:
e.g. IP: 192.168.129.10 Hostname: Node-1
e.g. IP: 192.168.129.11 Hostname: Node-2
iv. Hence, for all ssh communication between the two nodes, the hostname resolves
to an address on the 192.168.129.x subnet.
v. 12 services are configured in active/passive mode.
vi. 1 service is configured in master/slave mode.
vii. 8 of the active/passive services are non-sticky (they fail back).
viii. 4 of the active/passive services are sticky (they do not fail back).
ix. Distribution: Node-1 is primary for 8 services (of which 4 are
non-sticky); Node-2 is preferred for 4 of the 12 services (non-sticky).

b. Observations:
i. On Node-2, the interface carrying IP 192.168.129.11 (hostname Node-2)
was down.
ii. On Node-1, all interfaces were up.
iii. The interface used by corosync for heartbeat/messaging was up at all times
(bindnetaddr: 192.168.10.0).
iv. In crm_mon, Node-1 sees Node-2 as offline.
cibadmin --query fails (remote node did not respond).
v. In crm_mon, Node-2 sees Node-1 as online.
vi. All the services were active on Node-1 (including those for which
Node-2 is preferred), as observed in crm_mon output.
vii. The 4 services for which Node-2 is preferred were also active on Node-2
(hence 4 services were active on both nodes).
Observed in crm_mon output: only 4 services were shown as active; the status of
the remaining services active on Node-1 was not reflected in crm_mon,
even though crm_mon on Node-2 sees Node-1 as “online”.
c. Errors in log file:
i. On Node-2:
1. Resource ocf::RscRA:rsc appears to be active on 2 nodes
2. The above error appears for all the resources configured in Pacemaker.
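
As an illustration of the condition reported in the log, here is a minimal Python sketch that finds resources reported as started on more than one node. It assumes the per-node status has already been gathered into plain dictionaries (for example, copied by hand from crm_mon output). The resource names, the data layout and the function name find_multiply_active are hypothetical and are not part of Pacemaker.

```python
from collections import defaultdict


def find_multiply_active(status_by_node):
    """Return {resource: sorted node list} for resources active on more than one node.

    status_by_node maps a node name to the set of resources that node
    reports as started. This layout is an assumption for illustration.
    """
    nodes_by_resource = defaultdict(set)
    for node, resources in status_by_node.items():
        for resource in resources:
            nodes_by_resource[resource].add(node)
    return {
        resource: sorted(nodes)
        for resource, nodes in nodes_by_resource.items()
        if len(nodes) > 1
    }


# Hypothetical snapshot resembling the report: Node-1 runs everything,
# while Node-2 also runs a service it is preferred for.
snapshot = {
    "Node-1": {"rsc1", "rsc2", "rsc3"},
    "Node-2": {"rsc3"},
}
conflicts = find_multiply_active(snapshot)
print(conflicts)
```

Any resource that appears in the result is running in two places at once, which is the situation the cluster is meant to prevent.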


Query:
1) For what purpose does Pacemaker require “ssh without a pass key” (passwordless ssh) to be
enabled between the nodes in a cluster?
2) For what purpose does Pacemaker use the node “hostname”? How does the node “hostname”
come into the picture?
3) Let’s say that in a two-node cluster, two communication paths are available between
the two nodes.
a. eth1 and eth2.
b. The hostname of the node resolves to the IP address on eth1.
c. Suppose eth1 goes down (network cable disconnected).
d. eth2 is up, but the hostname does not resolve to the IP on eth2 (it resolves to
the eth1 address).
e. Will this hostname resolution cause any issue?




_______________________________________________
Pacemaker mailing list: Pacemaker [at] oss
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


df.cluster at gmail

Apr 19, 2012, 6:51 AM

Post #2 of 5 (1204 views)
Re: crm_mon on Node-2 shows both Node-1 & Node-2 as online but crm_mon on Node-1 shows Node-2 as offline

Hi,

On Thu, Apr 19, 2012 at 3:56 PM, Parshvi <parshvi.17 [at] gmail> wrote:
> Query:
> 1) For what purpose does Pacemaker require “ssh without a pass key” to be
> enabled between the nodes in a cluster ?

scp

> 2) For what purpose does Pacemaker use Node “hostname” for ? how Node “hostname”
> come into picture ?

When choosing where to allocate resources not explicitly tied to a node. See

http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/#node-score-equal

and

http://www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html-single/Pacemaker_Explained/#_background

> 3) Let’s say in a two node cluster two communication paths are available between
> the two nodes.
>  a. Eth1 and eth2.
>  b. The hostname of the node resolves to IP Address on eth1.
>  c. Consider, eth1 (network cable disconnected) goes down.
>  d. Eth2 is up, but hostname does not resolve to the IP on eth2 (resolves to
> eth1 addr).

Inter-node communication is usually specified by IP address, and
redundant connections (as in your case) are recommended.
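
To make the distinction concrete, here is a minimal Python sketch that checks whether the addresses the hostnames resolve to fall inside the network corosync binds to. It uses the addresses from the original post and assumes a /24 prefix, which the thread does not state. The function name addresses_outside_ring is hypothetical; only the standard ipaddress module is used.

```python
import ipaddress


def addresses_outside_ring(host_addresses, ring_network):
    """Return the hostnames whose address is not inside the ring network."""
    network = ipaddress.ip_network(ring_network)
    return sorted(
        name
        for name, addr in host_addresses.items()
        if ipaddress.ip_address(addr) not in network
    )


# Addresses from the original post; the /24 prefix length is an assumption.
hosts = {"Node-1": "192.168.129.10", "Node-2": "192.168.129.11"}
ring = "192.168.10.0/24"
outside = addresses_outside_ring(hosts, ring)
print(outside)
```

In this setup both hostnames resolve outside the corosync network, which suggests that cluster messaging on 192.168.10.0 takes a different path from anything that connects by hostname, such as ssh.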

>  e. Will this (hostname) have any issue ?



--
Dan Frincu
CCNA, RHCE



andrew at beekhof

Apr 19, 2012, 5:09 PM

Post #3 of 5 (1200 views)
Re: crm_mon on Node-2 shows both Node-1 & Node-2 as online but crm_mon on Node-1 shows Node-2 as offline

On Thu, Apr 19, 2012 at 11:51 PM, Dan Frincu <df.cluster [at] gmail> wrote:
> Hi,
>
> On Thu, Apr 19, 2012 at 3:56 PM, Parshvi <parshvi.17 [at] gmail> wrote:
>> Query:
>> 1) For what purpose does Pacemaker require ssh without a pass key to be
>> enabled between the nodes in a cluster ?
>
> scp

But pacemaker doesn't use scp... or is this in relation to the
clusters from scratch document?
-ECONFUSED



bubble at hoster-ok

Apr 19, 2012, 11:38 PM

Post #4 of 5 (1180 views)
Re: crm_mon on Node-2 shows both Node-1 & Node-2 as online but crm_mon on Node-1 shows Node-2 as offline

20.04.2012 03:09, Andrew Beekhof wrote:
> On Thu, Apr 19, 2012 at 11:51 PM, Dan Frincu <df.cluster [at] gmail> wrote:
>> Hi,
>>
>> On Thu, Apr 19, 2012 at 3:56 PM, Parshvi <parshvi.17 [at] gmail> wrote:
>>> Query:
>>> 1) For what purpose does Pacemaker require ssh without a pass key to be
>>> enabled between the nodes in a cluster ?
>>
>> scp
>
> But pacemaker doesn't use scp... or is this in relation to the
> clusters from scratch document?
> -ECONFUSED

hb_report?



df.cluster at gmail

Apr 20, 2012, 12:31 AM

Post #5 of 5 (1233 views)
Re: crm_mon on Node-2 shows both Node-1 & Node-2 as online but crm_mon on Node-1 shows Node-2 as offline

On Fri, Apr 20, 2012 at 3:09 AM, Andrew Beekhof <andrew [at] beekhof> wrote:
> On Thu, Apr 19, 2012 at 11:51 PM, Dan Frincu <df.cluster [at] gmail> wrote:
>> Hi,
>>
>> On Thu, Apr 19, 2012 at 3:56 PM, Parshvi <parshvi.17 [at] gmail> wrote:
>>> Query:
>>> 1) For what purpose does Pacemaker require “ssh without a pass key” to be
>>> enabled between the nodes in a cluster ?
>>
>> scp
>
> But pacemaker doesn't use scp... or is this in relation to the
> clusters from scratch document?

It's in relation to the Clusters from Scratch document.

> -ECONFUSED

Sorry about that ;)




--
Dan Frincu
CCNA, RHCE
