
Mailing List Archive: Linux-HA: Pacemaker

Corosync / Pacemaker Cluster crashing

 

 



kobus.bensch at bauerservices

Apr 20, 2012, 3:08 AM

Corosync / Pacemaker Cluster crashing

Hi

I have the following cluster setup:

2 physical Dell servers with RHEL6.2 with all the latest patches.

Each server has 3 network connections that look like this:

BOND0 (2 NICs)

ETH4 for Corosync
ETH6 for Corosync

This is the corosync config (corosync.conf):
aisexec {
    group: root
    user: root
}

compatibility: whitetank
service {
    use_mgmtd: yes
    use_logd: yes
    ver: 0
    name: pacemaker
}
totem {
    rrp_mode: active
    join: 180
    max_messages: 20
    vsftype: none
    token: 5000
    consensus: 6000
    secauth: on
    token_retransmits_before_loss_const: 10
    threads: 0
    #threads: 16
    version: 2
    interface {
        bindnetaddr: 10.255.1.0
        mcastaddr: 232.10.1.1
        mcastport: 5405
        ringnumber: 0
        ttl: 1
    }
    interface {
        bindnetaddr: 10.255.2.0
        mcastaddr: 232.10.2.1
        mcastport: 5405
        ringnumber: 1
        ttl: 1
    }
    clear_node_high_bit: yes
}
logging {
    to_logfile: yes
    to_syslog: yes
    debug: off
    timestamp: on
    logfile: /var/log/cluster/corosync.log
    to_stderr: no
    fileline: off
    syslog_facility: daemon
}
amf {
    mode: disabled
}
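
A side note on the node IDs that show up in the log further down (16908042 and 33685258): the totem section sets no nodeid, and in that case Corosync derives the ID from the IPv4 address bound to ring 0. The sketch below reproduces that arithmetic as a sanity check. It assumes a little-endian (x86) host and the clear_node_high_bit: yes setting from the config above; the function name nodeid_from_ipv4 is made up for illustration and is not part of Corosync.

```python
import ipaddress
import struct


def nodeid_from_ipv4(addr: str, clear_high_bit: bool = True) -> int:
    """Interpret the ring 0 IPv4 address as a 32-bit node ID.

    Assumption: the four address bytes (network order) are read as a
    native integer on a little-endian host, which matches the IDs
    printed in the log for 10.255.1.1 and 10.255.1.2.
    """
    packed = ipaddress.IPv4Address(addr).packed   # 4 bytes, network order
    (value,) = struct.unpack("<I", packed)        # read them little-endian
    if clear_high_bit:
        value &= 0x7FFFFFFF                       # mirrors clear_node_high_bit: yes
    return value


print(nodeid_from_ipv4("10.255.1.1"))  # 16908042
print(nodeid_from_ipv4("10.255.1.2"))  # 33685258
```

Both values match the IDs logged for lxdcv01nd01 and lxdcv01nd02, so ring 0 addressing appears to be picked up as configured.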

The pacemaker plugin:
/etc/corosync/service.d/pcmk
service {
    # Load the Pacemaker Cluster Resource Manager
    name: pacemaker
    ver: 1
}
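
Taken together, the two files above declare a service block named pacemaker twice: once in corosync.conf with ver: 0 and once in service.d/pcmk with ver: 1. Whether that is related to the crash is not established here, but it is easy to check for mechanically. The following is a minimal sketch, not a real Corosync config parser: it assumes service blocks contain no nested braces, and the helper names service_names and duplicated_services are hypothetical.

```python
import re
from collections import Counter


def service_names(conf_text: str) -> list[str]:
    """Return the `name:` value of every `service { ... }` block."""
    names = []
    # Assumption: service blocks are flat (no nested braces inside them).
    for block in re.findall(r"\bservice\s*\{([^{}]*)\}", conf_text):
        match = re.search(r"^\s*name\s*:\s*(\S+)", block, re.MULTILINE)
        if match:
            names.append(match.group(1))
    return names


def duplicated_services(*conf_texts: str) -> dict[str, int]:
    """Map each service name declared more than once to its count."""
    counts = Counter(n for text in conf_texts for n in service_names(text))
    return {name: n for name, n in counts.items() if n > 1}


# The two service blocks exactly as posted above.
COROSYNC_CONF = """
service {
use_mgmtd: yes
use_logd: yes
ver: 0
name: pacemaker
}
"""
PCMK_SNIPPET = """
service {
# Load the Pacemaker Cluster Resource Manager
name: pacemaker
ver: 1
}
"""

print(duplicated_services(COROSYNC_CONF, PCMK_SNIPPET))
```

In practice the same functions could be fed the contents of /etc/corosync/corosync.conf and every file under /etc/corosync/service.d/.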

Corosync keeps crashing whenever I try to do anything in the crm CLI. It does not matter whether I am moving resources or creating them.

The Pacemaker (crm) config for now is very simple and looks like this:
node lxdcv01nd01
node lxdcv01nd02
primitive lcdcv01 ocf:heartbeat:IPaddr2 \
    params ip="10.1.0.95" cidr_netmask="32" \
    op monitor interval="30s"
primitive local-manage ocf:heartbeat:IPaddr2 \
    params ip="127.0.2.1" cidr_netmask="32" \
    op monitor interval="30s"
location cli-prefer-lcdcv01 lcdcv01 \
    rule $id="cli-prefer-rule-lcdcv01" inf: #uname eq lxdcv01nd02
location cli-prefer-local-manage local-manage \
    rule $id="cli-prefer-rule-local-manage" inf: #uname eq lxdcv01nd02
property $id="cib-bootstrap-options" \
    dc-version="1.0.12-unknown" \
    cluster-infrastructure="openais" \
    expected-quorum-votes="2" \
    stonith-enabled="false" \
    no-quorum-policy="ignore"

I have tried disabling various config lines, but still no joy. Any help would be appreciated.

When the server crashes I get this in the log:
Apr 20 10:54:17 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:17 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:17 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:18 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:18 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:18 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:18 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:19 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:19 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:19 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:19 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:20 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:20 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:20 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:20 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:21 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:21 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:21 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:21 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:22 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:22 corosync [TOTEM ] FAILED TO RECEIVE
Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [21259]: ERROR: ais_dispatch: Receiving message body failed: (2) Library error: Resource temporarily unavailable (11)
Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [21254]: ERROR: ais_dispatch: Receiving message body failed: (2) Library error: Invalid argument (22)
Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [21257]: ERROR: ais_dispatch: Receiving message body failed: (2) Library error: Resource temporarily unavailable (11)
Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group cib: [21255]: ERROR: ais_dispatch: Receiving message body failed: (2) Library error: Resource temporarily unavailable (11)
Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [21259]: ERROR: ais_dispatch: AIS connection failed
Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [21254]: ERROR: ais_dispatch: AIS connection failed
Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [21257]: ERROR: ais_dispatch: AIS connection failed
Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [21259]: ERROR: crm_ais_destroy: AIS connection terminated
Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group cib: [21255]: ERROR: ais_dispatch: AIS connection failed
Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [21254]: ERROR: AIS connection terminated
Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [21257]: CRIT: attrd_ais_destroy: Lost connection to OpenAIS service!
Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group cib: [21255]: ERROR: cib_ais_destroy: AIS connection terminated
Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [21257]: info: main: Exiting...
Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [21257]: ERROR: attrd_cib_connection_destroy: Connection to the CIB terminated...
Apr 20 10:54:36 corosync [MAIN ] Corosync Cluster Engine ('1.2.7'): started and ready to provide service.
Apr 20 10:54:36 corosync [MAIN ] Corosync built-in features: nss rdma
Apr 20 10:54:36 corosync [MAIN ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
Apr 20 10:54:36 corosync [TOTEM ] Initializing transport (UDP/IP).
Apr 20 10:54:36 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Apr 20 10:54:36 corosync [TOTEM ] Initializing transport (UDP/IP).
Apr 20 10:54:36 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Apr 20 10:54:36 corosync [TOTEM ] The network interface [10.255.1.1] is now up.
Apr 20 10:54:36 corosync [pcmk ] info: process_ais_conf: Reading configure
Set r/w permissions for uid=0, gid=0 on /var/log/cluster/corosync.log
Apr 20 10:54:36 corosync [pcmk ] info: config_find_init: Local handle: 5650605097994944514 for logging
Apr 20 10:54:36 corosync [pcmk ] info: config_find_next: Processing additional logging options...
Apr 20 10:54:36 corosync [pcmk ] info: get_config_opt: Found 'off' for option: debug
Apr 20 10:54:36 corosync [pcmk ] info: get_config_opt: Found 'yes' for option: to_logfile
Apr 20 10:54:36 corosync [pcmk ] info: get_config_opt: Found '/var/log/cluster/corosync.log' for option: logfile
Apr 20 10:54:36 corosync [pcmk ] info: get_config_opt: Found 'yes' for option: to_syslog
Apr 20 10:54:36 corosync [pcmk ] info: get_config_opt: Found 'daemon' for option: syslog_facility
Apr 20 10:54:36 corosync [pcmk ] info: config_find_init: Local handle: 2730409743423111171 for service
Apr 20 10:54:36 corosync [pcmk ] info: config_find_next: Processing additional service options...
Apr 20 10:54:36 corosync [pcmk ] info: get_config_opt: Defaulting to 'pcmk' for option: clustername
Apr 20 10:54:36 corosync [pcmk ] info: get_config_opt: Found 'yes' for option: use_logd
Apr 20 10:54:36 corosync [pcmk ] info: get_config_opt: Found 'yes' for option: use_mgmtd
Apr 20 10:54:36 corosync [pcmk ] info: pcmk_startup: CRM: Initialized
Apr 20 10:54:36 corosync [pcmk ] Logging: Initialized pcmk_startup
Apr 20 10:54:36 corosync [pcmk ] info: pcmk_startup: Maximum core file size is: 18446744073709551615
Apr 20 10:54:36 corosync [pcmk ] info: pcmk_startup: Service: 9
Apr 20 10:54:36 corosync [pcmk ] info: pcmk_startup: Local hostname: lxdcv01nd01.bauer-uk.bauermedia.group
Apr 20 10:54:36 corosync [pcmk ] info: pcmk_update_nodeid: Local node id: 16908042
Apr 20 10:54:36 corosync [pcmk ] info: update_member: Creating entry for node 16908042 born on 0
Apr 20 10:54:36 corosync [pcmk ] info: update_member: 0x18db8e0 Node 16908042 now known as lxdcv01nd01.bauer-uk.bauermedia.group (was: (null))
Apr 20 10:54:36 corosync [pcmk ] info: update_member: Node lxdcv01nd01.bauer-uk.bauermedia.group now has 1 quorum votes (was 0)
Apr 20 10:54:36 corosync [pcmk ] info: update_member: Node 16908042/lxdcv01nd01.bauer-uk.bauermedia.group is now: member
Apr 20 10:54:36 corosync [pcmk ] info: spawn_child: Forked child 22445 for process stonithd
Apr 20 10:54:36 corosync [pcmk ] info: spawn_child: Forked child 22446 for process cib
Apr 20 10:54:36 corosync [pcmk ] info: spawn_child: Forked child 22447 for process lrmd
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [21256]: info: lrmd is shutting down
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]: WARN: Initializing connection to logging daemon failed. Logging daemon may not be running
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: WARN: Initializing connection to logging daemon failed. Logging daemon may not be running
Apr 20 10:54:36 corosync [pcmk ] info: spawn_child: Forked child 22448 for process attrd
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: WARN: Initializing connection to logging daemon failed. Logging daemon may not be running
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]: info: G_main_add_SignalHandler: Added signal handler for signal 10
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info: Signal sent to pid=21256, waiting for process to exit
Apr 20 10:54:36 corosync [pcmk ] info: spawn_child: Forked child 22449 for process pengine
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: WARN: Initializing connection to logging daemon failed. Logging daemon may not be running
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: Invoked: /usr/lib64/heartbeat/cib
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]: info: G_main_add_SignalHandler: Added signal handler for signal 12
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group pengine: [22449]: WARN: Initializing connection to logging daemon failed. Logging daemon may not be running
Apr 20 10:54:36 corosync [pcmk ] info: spawn_child: Forked child 22450 for process crmd
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: Invoked: /usr/lib64/heartbeat/attrd
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: G_main_add_TriggerHandler: Added signal manual handler
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group pengine: [22449]: info: Invoked: /usr/lib64/heartbeat/pengine
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: WARN: Initializing connection to logging daemon failed. Logging daemon may not be running
Apr 20 10:54:36 corosync [pcmk ] info: spawn_child: Forked child 22451 for process mgmtd
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: main: Starting up
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group pengine: [22449]: WARN: main: Terminating previous PE instance
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig)
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]: info: crm_cluster_connect: Connecting to OpenAIS
Apr 20 10:54:36 corosync [SERV ] Service engine loaded: Pacemaker Cluster Manager 1.0.12
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: crm_cluster_connect: Connecting to OpenAIS
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group pengine: [21258]: WARN: process_pe_message: Received quit message, terminating
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]: info: init_ais_connection_once: Creating connection to our AIS plugin
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: Invoked: /usr/lib64/heartbeat/crmd
Apr 20 10:54:36 corosync [SERV ] Service failed to load 'pacemaker'.
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: init_ais_connection_once: Creating connection to our AIS plugin
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: main: CRM Hg Version: unknown

Apr 20 10:54:36 corosync [SERV ] Service engine loaded: corosync extended virtual synchrony service
Apr 20 10:54:36 corosync [SERV ] Service engine loaded: corosync configuration service
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: crmd_init: Starting crmd
Apr 20 10:54:36 corosync [SERV ] Service engine loaded: corosync cluster closed process group service v1.01
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Apr 20 10:54:36 corosync [SERV ] Service engine loaded: corosync cluster config database access v1.01
Apr 20 10:54:36 corosync [SERV ] Service engine loaded: corosync profile loading service
Apr 20 10:54:36 corosync [SERV ] Service engine loaded: corosync cluster quorum service v0.1
Apr 20 10:54:36 corosync [MAIN ] Compatibility mode set to whitetank. Using V1 and V2 of the synchronization engine.
Apr 20 10:54:36 corosync [TOTEM ] The network interface [10.255.2.1] is now up.
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]: info: init_ais_connection_once: AIS connection established
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: startCib: CIB Initialization completed successfully
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: crm_cluster_connect: Connecting to OpenAIS
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: init_ais_connection_once: Creating connection to our AIS plugin
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: init_ais_connection_once: AIS connection established
Apr 20 10:54:36 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x18e7150 for stonithd/22445
Apr 20 10:54:36 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x18eb4b0 for attrd/22448
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: get_ais_nodeid: Server details: id=16908042 uname=lxdcv01nd01.bauer-uk.bauermedia.group cname=pcmk
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]: info: get_ais_nodeid: Server details: id=16908042 uname=lxdcv01nd01.bauer-uk.bauermedia.group cname=pcmk
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: crm_new_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group now has id: 16908042
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]: info: crm_new_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group now has id: 16908042
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: crm_new_peer: Node 16908042 is now known as lxdcv01nd01.bauer-uk.bauermedia.group
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: main: Cluster connection active
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]: info: crm_new_peer: Node 16908042 is now known as lxdcv01nd01.bauer-uk.bauermedia.group
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: main: Accepting attribute updates
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: main: Starting mainloop...
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]: notice: /usr/lib64/heartbeat/stonithd start up successfully.
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: init_ais_connection_once: AIS connection established
Apr 20 10:54:36 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x18ef810 for cib/22446
Apr 20 10:54:36 corosync [pcmk ] info: update_member: Node lxdcv01nd01.bauer-uk.bauermedia.group now has process list: 00000000000000000000000000053312 (340754)
Apr 20 10:54:36 corosync [pcmk ] info: pcmk_ipc: Sending membership update 0 to cib
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: get_ais_nodeid: Server details: id=16908042 uname=lxdcv01nd01.bauer-uk.bauermedia.group cname=pcmk
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: crm_new_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group now has id: 16908042
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: crm_new_peer: Node 16908042 is now known as lxdcv01nd01.bauer-uk.bauermedia.group
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: cib_init: Starting cib mainloop
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: ais_dispatch: Membership 0: quorum still lost
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: crm_update_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group: id=16908042 state=member (new) addr=(null) votes=1 (new) born=0 seen=0 proc=00000000000000000000000000053312 (new)
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22455]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-80.raw
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22455]: info: write_cib_contents: Wrote version 0.89.0 of the CIB to disk (digest: e15d151e0fed09d1d411b21b345a8952)
Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22455]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.cZHXQX (digest: /var/lib/heartbeat/crm/cib.U3NqAd)
Apr 20 10:54:36 corosync [TOTEM ] Incrementing problem counter for seqid 1 iface 10.255.2.1 to [1 of 10]
Apr 20 10:54:36 corosync [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 208: memb=0, new=0, lost=0
Apr 20 10:54:36 corosync [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 208: memb=1, new=1, lost=0
Apr 20 10:54:36 corosync [pcmk ] info: pcmk_peer_update: NEW: lxdcv01nd01.bauer-uk.bauermedia.group 16908042
Apr 20 10:54:36 corosync [pcmk ] info: pcmk_peer_update: MEMB: lxdcv01nd01.bauer-uk.bauermedia.group 16908042
Apr 20 10:54:36 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Apr 20 10:54:36 corosync [MAIN ] Completed service synchronization, ready to provide service.
Apr 20 10:54:37 corosync [pcmk ] notice: pcmk_peer_update: Transitional membership event on ring 212: memb=1, new=0, lost=0
Apr 20 10:54:37 corosync [pcmk ] info: pcmk_peer_update: memb: lxdcv01nd01.bauer-uk.bauermedia.group 16908042
Apr 20 10:54:37 corosync [pcmk ] notice: pcmk_peer_update: Stable membership event on ring 212: memb=2, new=1, lost=0
Apr 20 10:54:37 corosync [pcmk ] info: update_member: Creating entry for node 33685258 born on 212
Apr 20 10:54:37 corosync [pcmk ] info: update_member: Node 33685258/unknown is now: member
Apr 20 10:54:37 corosync [pcmk ] info: pcmk_peer_update: NEW: .pending. 33685258
Apr 20 10:54:37 corosync [pcmk ] info: pcmk_peer_update: MEMB: lxdcv01nd01.bauer-uk.bauermedia.group 16908042
Apr 20 10:54:37 corosync [pcmk ] info: pcmk_peer_update: MEMB: .pending. 33685258
Apr 20 10:54:37 corosync [pcmk ] info: send_member_notification: Sending membership update 212 to 1 children
Apr 20 10:54:37 corosync [pcmk ] info: update_member: 0x18db8e0 Node 16908042 (lxdcv01nd01.bauer-uk.bauermedia.group) born on: 212
Apr 20 10:54:37 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: ais_dispatch: Membership 212: quorum still lost
Apr 20 10:54:37 corosync [pcmk ] info: update_member: 0x18e6ac0 Node 33685258 ((null)) born on: 196
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: crm_new_peer: Node <null> now has id: 33685258
Apr 20 10:54:37 corosync [pcmk ] info: update_member: 0x18e6ac0 Node 33685258 now known as lxdcv01nd02.bauer-uk.bauermedia.group (was: (null))
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: crm_update_peer: Node (null): id=33685258 state=member (new) addr=r(0) ip(10.255.1.2) r(1) ip(10.255.2.2) votes=0 born=0 seen=212 proc=00000000000000000000000000000000
Apr 20 10:54:37 corosync [pcmk ] info: update_member: Node lxdcv01nd02.bauer-uk.bauermedia.group now has process list: 00000000000000000000000000013312 (78610)
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: crm_update_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group: id=16908042 state=member addr=r(0) ip(10.255.1.1) r(1) ip(10.255.2.1) (new) votes=1 born=0 seen=212 proc=00000000000000000000000000053312
Apr 20 10:54:37 corosync [pcmk ] info: update_member: Node lxdcv01nd02.bauer-uk.bauermedia.group now has 1 quorum votes (was 0)
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: notice: ais_dispatch: Membership 212: quorum acquired
Apr 20 10:54:37 corosync [pcmk ] info: send_member_notification: Sending membership update 212 to 1 children
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: crm_get_peer: Node 33685258 is now known as lxdcv01nd02.bauer-uk.bauermedia.group
Apr 20 10:54:37 corosync [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: unknown (rc=-2)
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: crm_update_peer: Node lxdcv01nd02.bauer-uk.bauermedia.group: id=33685258 state=member addr=r(0) ip(10.255.1.2) r(1) ip(10.255.2.2) votes=1 (new) born=196 seen=212 proc=00000000000000000000000000013312 (new)
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: cib_process_diff: Diff 0.91.3 -> 0.91.4 not applied to 0.89.0: current "epoch" is less than required
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: cib_server_process_diff: Requesting re-sync from peer
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: WARN: cib_diff_notify: Local-only Change (client:crmd, call: 77): -1.-1.-1 (Application of an update diff failed, requesting a full refresh)
Apr 20 10:54:37 corosync [MAIN ] Completed service synchronization, ready to provide service.
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: WARN: cib_server_process_diff: Not applying diff 0.91.4 -> 0.91.5 (sync in progress)
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: WARN: cib_server_process_diff: Not applying diff 0.91.5 -> 0.91.6 (sync in progress)
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: WARN: cib_server_process_diff: Not applying diff 0.91.6 -> 0.92.1 (sync in progress)
Apr 20 10:54:37 corosync [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: unknown (rc=-2)
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info: cib_replace_notify: Local-only Replace: -1.-1.-1 from lxdcv01nd02.bauer-uk.bauermedia.group
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22456]: info: write_cib_contents: Archived previous version as /var/lib/heartbeat/crm/cib-81.raw
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22456]: info: write_cib_contents: Wrote version 0.92.0 of the CIB to disk (digest: 65cf2f5895618dbd08c40b8c39a479c5)
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22456]: info: retrieveCib: Reading cluster configuration from: /var/lib/heartbeat/crm/cib.8nhla0 (digest: /var/lib/heartbeat/crm/cib.nUDbdi)
Apr 20 10:54:37 corosync [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: unknown (rc=-2)
Apr 20 10:54:37 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child process mgmtd exited (pid=22451, rc=100)
Apr 20 10:54:37 corosync [pcmk ] notice: pcmk_wait_dispatch: Child process mgmtd no longer wishes to be respawned
Apr 20 10:54:37 corosync [pcmk ] info: update_member: Node lxdcv01nd01.bauer-uk.bauermedia.group now has process list: 00000000000000000000000000013312 (78610)
Apr 20 10:54:37 corosync [pcmk ] WARN: route_ais_message: Sending message to local.crmd failed: unknown (rc=-2)
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info: G_main_add_SignalHandler: Added signal handler for signal 15
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info: enabling coredumps
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info: G_main_add_SignalHandler: Added signal handler for signal 10
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info: G_main_add_SignalHandler: Added signal handler for signal 12
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info: Started.
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: do_cib_control: CIB connection established
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: crm_cluster_connect: Connecting to OpenAIS
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: init_ais_connection_once: Creating connection to our AIS plugin
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: init_ais_connection_once: AIS connection established
Apr 20 10:54:37 corosync [pcmk ] info: pcmk_ipc: Recorded connection 0x18f53c0 for crmd/22450
Apr 20 10:54:37 corosync [pcmk ] info: pcmk_ipc: Sending membership update 212 to crmd
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: get_ais_nodeid: Server details: id=16908042 uname=lxdcv01nd01.bauer-uk.bauermedia.group cname=pcmk
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: crm_new_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group now has id: 16908042
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: crm_new_peer: Node 16908042 is now known as lxdcv01nd01.bauer-uk.bauermedia.group
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: do_ha_control: Connected to the cluster
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: do_started: Delaying start, CCM (0000000000100000) not connected
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: crmd_init: Starting crmd's mainloop
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: config_query_callback: Checking for expired actions every 900000ms
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: config_query_callback: Sending expected-votes=2 to corosync
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: notice: ais_dispatch: Membership 212: quorum acquired
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: crm_new_peer: Node lxdcv01nd02.bauer-uk.bauermedia.group now has id: 33685258
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: crm_new_peer: Node 33685258 is now known as lxdcv01nd02.bauer-uk.bauermedia.group
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: crm_update_peer: Node lxdcv01nd02.bauer-uk.bauermedia.group: id=33685258 state=member (new) addr=r(0) ip(10.255.1.2) r(1) ip(10.255.2.2) votes=1 born=196 seen=212 proc=00000000000000000000000000013312
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: crm_update_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group: id=16908042 state=member (new) addr=r(0) ip(10.255.1.1) r(1) ip(10.255.2.1) (new) votes=1 (new) born=212 seen=212 proc=00000000000000000000000000013312 (new)
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: do_started: The local CRM is operational
Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: do_state_transition: State transition S_STARTING -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_started ]
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group pengine: [22449]: info: main: Starting pengine
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: ais_dispatch: Membership 212: quorum retained
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: update_dc: Set DC to lxdcv01nd02.bauer-uk.bauermedia.group (3.0.1)
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: update_attrd: Connecting to attrd...
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: find_hash_entry: Creating hash entry for terminate
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: do_state_transition: State transition S_PENDING -> S_NOT_DC [ input=I_NOT_DC cause=C_HA_MESSAGE origin=do_cl_join_finalize_respond ]
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: find_hash_entry: Creating hash entry for shutdown
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: attrd_local_callback: Sending full refresh (origin=crmd)
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: erase_xpath_callback: Deletion of "//node_state[@uname='lxdcv01nd01.bauer-uk.bauermedia.group']/transient_attributes": ok (rc=0)
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: crm_new_peer: Node lxdcv01nd02.bauer-uk.bauermedia.group now has id: 33685258
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: crm_new_peer: Node 33685258 is now known as lxdcv01nd02.bauer-uk.bauermedia.group
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: find_hash_entry: Creating hash entry for probe_complete
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: attrd_perform_update: Delaying operation probe_complete=<null>: cib not connected
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: do_lrm_rsc_op: Performing key=6:4:7:e6a3b9c7-c24d-497a-9c07-d6082ee231a9 op=local-manage_monitor_0 )
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info: rsc:local-manage:2: probe
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: do_lrm_rsc_op: Performing key=7:4:7:e6a3b9c7-c24d-497a-9c07-d6082ee231a9 op=lcdcv01_monitor_0 )
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info: rsc:lcdcv01:3: probe
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: process_lrm_event: LRM operation lcdcv01_monitor_0 (call=3, rc=0, cib-update=7, confirmed=true) ok
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: process_lrm_event: LRM operation local-manage_monitor_0 (call=2, rc=7, cib-update=8, confirmed=true) not running
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: attrd_perform_update: Delaying operation probe_complete=true: cib not connected
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: do_lrm_rsc_op: Performing key=9:5:0:e6a3b9c7-c24d-497a-9c07-d6082ee231a9 op=lcdcv01_stop_0 )
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info: rsc:lcdcv01:4: stop
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: attrd_perform_update: Delaying operation probe_complete=true: cib not connected
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info: RA output: (lcdcv01:stop:stderr) logd is not running
Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info: process_lrm_event: LRM operation lcdcv01_stop_0 (call=4, rc=0, cib-update=9, confirmed=true) ok
Apr 20 10:54:38 corosync [TOTEM ] ring 1 active with no faults
Apr 20 10:54:41 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: cib_connect: Connected to the CIB after 1 signon attempts
Apr 20 10:54:41 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: cib_connect: Sending full refresh
Apr 20 10:54:41 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: attrd_trigger_update: Sending flush op to all hosts for: probe_complete (true)
Apr 20 10:54:41 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info: attrd_perform_update: Sent update 4: probe_complete=true




df.cluster at gmail

Apr 20, 2012, 3:30 AM

Post #2 of 4 (412 views)
Re: Corosync / Pacemaker Cluster crashing

Hi,

On Fri, Apr 20, 2012 at 1:08 PM, Bensch, Kobus
<kobus.bensch [at] bauerservices> wrote:
> Hi
>
> I have the following cluster setup:
>
> 2 physical Dell servers with RHEL6.2 with all the latest patches.
>
> Each server has 3 network connections that looks like this:
>
> BOND0 2 NIC's
>
> ETH4 for Corosync
> ETH6 for corosync
>
> This is the corosync config:
> corosync.conf
> aisexec {
> group: root
> user: root
> }
>
> compatibility: whitetank
> service {
> use_mgmtd: yes
> use_logd: yes
> ver: 0
> name: pacemaker
> }
> totem {
> rrp_mode: active
> join: 180
> max_messages: 20
> vsftype: none
> token: 5000
> consensus: 6000
> secauth: on
> token_retransmits_before_loss_const: 10
> threads: 0
> #threads: 16
> version: 2
> interface {
> bindnetaddr: 10.255.1.0
> mcastaddr: 232.10.1.1
> mcastport: 5405
> ringnumber: 0
> ttl: 1
> }
> interface {
> bindnetaddr: 10.255.2.0
> mcastaddr: 232.10.2.1
> mcastport: 5405
> ringnumber: 1
> ttl: 1
> }
> clear_node_high_bit: yes
> }
> logging {
> to_logfile: yes
> to_syslog: yes
> debug: off
> timestamp: on
> logfile: /var/log/cluster/corosync.log
> to_stderr: no
> fileline: off
> syslog_facility: daemon
> }
> amf {
> mode: disabled
> }
>
> The pacemaker plugin:
> /etc/corosync/service.d/pcmk
> service {
>         # Load the Pacemaker Cluster Resource Manager
>         name: pacemaker
>         ver:  1
> }
>
> Corosync keeps crashing when I try to do anything in the crm cli. Whether it
> is moving resources, creating resources, it does not matter.
>
> The Pacemaker (crm) config for now is very simple and looks like this:
> node lxdcv01nd01
> node lxdcv01nd02
> primitive lcdcv01 ocf:heartbeat:IPaddr2 \
> params ip="10.1.0.95" cidr_netmask="32" \
> op monitor interval="30s"
> primitive local-manage ocf:heartbeat:IPaddr2 \
> params ip="127.0.2.1" cidr_netmask="32" \
> op monitor interval="30s"
> location cli-prefer-lcdcv01 lcdcv01 \
> rule $id="cli-prefer-rule-lcdcv01" inf: #uname eq lxdcv01nd02
> location cli-prefer-local-manage local-manage \
> rule $id="cli-prefer-rule-local-manage" inf: #uname eq lxdcv01nd02
> property $id="cib-bootstrap-options" \
> dc-version="1.0.12-unknown" \

First glitch in the matrix: what version of Pacemaker are you running?
1.0.12-unknown seems fishy (self-compiled, maybe?).
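
To make the check above concrete, here is a minimal Python sketch that splits a dc-version value into its release number and build identifier. The helper names (`parse_dc_version`, `looks_self_built`) are hypothetical, and treating a build identifier of "unknown" as a sign of a local build is only a heuristic assumption, not something Pacemaker itself reports.

```python
import re


def parse_dc_version(value):
    """Split a dc-version string such as '1.0.12-unknown' into
    (version tuple, build identifier)."""
    match = re.fullmatch(r"(\d+(?:\.\d+)*)-(.+)", value.strip())
    if match is None:
        raise ValueError(f"unrecognised dc-version: {value!r}")
    version = tuple(int(part) for part in match.group(1).split("."))
    return version, match.group(2)


def looks_self_built(value):
    """Heuristic (assumption): a build identifier of 'unknown' hints that
    the binaries did not come from a regular distribution package."""
    _, build_id = parse_dc_version(value)
    return build_id.lower() == "unknown"


if __name__ == "__main__":
    # Value taken from the cib-bootstrap-options quoted above.
    print(parse_dc_version("1.0.12-unknown"))
    print(looks_self_built("1.0.12-unknown"))
```

A positive result only means the question is worth asking; the package manager on the node remains the authoritative source for where the binaries came from.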

> cluster-infrastructure="openais" \
> expected-quorum-votes="2" \
> stonith-enabled="false" \
> no-quorum-policy="ignore"
>
> I tried to disable various config lines but still no joy. Any help would be
> appreciated.
>
> When the server crashes I get this in the log:
> Apr 20 10:54:17 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:17 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:17 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:18 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:18 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:18 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:18 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:19 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:19 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:19 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:19 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:20 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:20 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:20 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:20 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:21 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:21 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:21 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:21 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:22 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:22 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [21259]: ERROR:
> ais_dispatch: Receiving message body failed: (2) Library error: Resource
> temporarily unavailable (11)
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [21254]:
> ERROR: ais_dispatch: Receiving message body failed: (2) Library error:
> Invalid argument (22)
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [21257]: ERROR:
> ais_dispatch: Receiving message body failed: (2) Library error: Resource
> temporarily unavailable (11)
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group cib: [21255]: ERROR:
> ais_dispatch: Receiving message body failed: (2) Library error: Resource
> temporarily unavailable (11)
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [21259]: ERROR:
> ais_dispatch: AIS connection failed
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [21254]:
> ERROR: ais_dispatch: AIS connection failed
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [21257]: ERROR:
> ais_dispatch: AIS connection failed
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [21259]: ERROR:
> crm_ais_destroy: AIS connection terminated
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group cib: [21255]: ERROR:
> ais_dispatch: AIS connection failed
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [21254]:
> ERROR: AIS connection terminated
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [21257]: CRIT:
> attrd_ais_destroy: Lost connection to OpenAIS service!
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group cib: [21255]: ERROR:
> cib_ais_destroy: AIS connection terminated
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [21257]: info:
> main: Exiting...
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [21257]: ERROR:
> attrd_cib_connection_destroy: Connection to the CIB terminated...
> Apr 20 10:54:36 corosync [MAIN  ] Corosync Cluster Engine ('1.2.7'): started
> and ready to provide service.

Corosync 1.2.7 and rrp_mode active don't work well together. I say
"well" only to be on the safe side; I think they don't work at all.

IIRC, RHEL 6.2 ships Corosync 1.4.x, which raises the question: where
did you get your packages from?

Are you running on Oracle's RHEL clone by any chance?
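
The combination described above can be checked mechanically. The following is a rough Python sketch that reads the Corosync version from the startup banner in the log and the rrp_mode value from corosync.conf, then flags the pairing of an old Corosync with rrp_mode active. The function names are hypothetical, and the (1, 4) threshold is an assumption taken from the "1.4.x" recollection in this reply, not a documented cut-off.

```python
import re

# Startup banner as it appears in the quoted log.
STARTUP_RE = re.compile(r"Corosync Cluster Engine \('(\d+(?:\.\d+)*)'\)")
# rrp_mode directive as written in corosync.conf.
RRP_RE = re.compile(r"^\s*rrp_mode:\s*(\S+)", re.MULTILINE)


def corosync_version(log_text):
    """Return the Corosync version from the startup banner as a tuple, or None."""
    match = STARTUP_RE.search(log_text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def configured_rrp_mode(conf_text):
    """Return the rrp_mode value from corosync.conf text, or None if absent."""
    match = RRP_RE.search(conf_text)
    return match.group(1) if match else None


def is_suspect(log_text, conf_text, minimum=(1, 4)):
    """True when Corosync is older than `minimum` (an assumed threshold)
    and rrp_mode is set to active."""
    version = corosync_version(log_text)
    mode = configured_rrp_mode(conf_text)
    return version is not None and version < minimum and mode == "active"


if __name__ == "__main__":
    log = ("Apr 20 10:54:36 corosync [MAIN  ] Corosync Cluster Engine "
           "('1.2.7'): started and ready to provide service.")
    conf = "totem {\nrrp_mode: active\nversion: 2\n}\n"
    print(corosync_version(log), configured_rrp_mode(conf), is_suspect(log, conf))
```

The patterns expect the unquoted log and config text, so the "> " prefixes from a mail reply would need stripping first.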

> Apr 20 10:54:36 corosync [MAIN  ] Corosync built-in features: nss rdma
> Apr 20 10:54:36 corosync [MAIN  ] Successfully read main configuration file
> '/etc/corosync/corosync.conf'.
> Apr 20 10:54:36 corosync [TOTEM ] Initializing transport (UDP/IP).
> Apr 20 10:54:36 corosync [TOTEM ] Initializing transmit/receive security:
> libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Apr 20 10:54:36 corosync [TOTEM ] Initializing transport (UDP/IP).
> Apr 20 10:54:36 corosync [TOTEM ] Initializing transmit/receive security:
> libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Apr 20 10:54:36 corosync [TOTEM ] The network interface [10.255.1.1] is now
> up.
> Apr 20 10:54:36 corosync [pcmk  ] info: process_ais_conf: Reading configure
> Set r/w permissions for uid=0, gid=0 on /var/log/cluster/corosync.log
> Apr 20 10:54:36 corosync [pcmk  ] info: config_find_init: Local handle:
> 5650605097994944514 for logging
> Apr 20 10:54:36 corosync [pcmk  ] info: config_find_next: Processing
> additional logging options...
> Apr 20 10:54:36 corosync [pcmk  ] info: get_config_opt: Found 'off' for
> option: debug
> Apr 20 10:54:36 corosync [pcmk  ] info: get_config_opt: Found 'yes' for
> option: to_logfile
> Apr 20 10:54:36 corosync [pcmk  ] info: get_config_opt: Found
> '/var/log/cluster/corosync.log' for option: logfile
> Apr 20 10:54:36 corosync [pcmk  ] info: get_config_opt: Found 'yes' for
> option: to_syslog
> Apr 20 10:54:36 corosync [pcmk  ] info: get_config_opt: Found 'daemon' for
> option: syslog_facility
> Apr 20 10:54:36 corosync [pcmk  ] info: config_find_init: Local handle:
> 2730409743423111171 for service
> Apr 20 10:54:36 corosync [pcmk  ] info: config_find_next: Processing
> additional service options...
> Apr 20 10:54:36 corosync [pcmk  ] info: get_config_opt: Defaulting to 'pcmk'
> for option: clustername
> Apr 20 10:54:36 corosync [pcmk  ] info: get_config_opt: Found 'yes' for
> option: use_logd
> Apr 20 10:54:36 corosync [pcmk  ] info: get_config_opt: Found 'yes' for
> option: use_mgmtd
> Apr 20 10:54:36 corosync [pcmk  ] info: pcmk_startup: CRM: Initialized
> Apr 20 10:54:36 corosync [pcmk  ] Logging: Initialized pcmk_startup
> Apr 20 10:54:36 corosync [pcmk  ] info: pcmk_startup: Maximum core file size
> is: 18446744073709551615
> Apr 20 10:54:36 corosync [pcmk  ] info: pcmk_startup: Service: 9
> Apr 20 10:54:36 corosync [pcmk  ] info: pcmk_startup: Local hostname:
> lxdcv01nd01.bauer-uk.bauermedia.group
> Apr 20 10:54:36 corosync [pcmk  ] info: pcmk_update_nodeid: Local node id:
> 16908042
> Apr 20 10:54:36 corosync [pcmk  ] info: update_member: Creating entry for
> node 16908042 born on 0
> Apr 20 10:54:36 corosync [pcmk  ] info: update_member: 0x18db8e0 Node
> 16908042 now known as lxdcv01nd01.bauer-uk.bauermedia.group (was: (null))
> Apr 20 10:54:36 corosync [pcmk  ] info: update_member: Node
> lxdcv01nd01.bauer-uk.bauermedia.group now has 1 quorum votes (was 0)
> Apr 20 10:54:36 corosync [pcmk  ] info: update_member: Node
> 16908042/lxdcv01nd01.bauer-uk.bauermedia.group is now: member
> Apr 20 10:54:36 corosync [pcmk  ] info: spawn_child: Forked child 22445 for
> process stonithd
> Apr 20 10:54:36 corosync [pcmk  ] info: spawn_child: Forked child 22446 for
> process cib
> Apr 20 10:54:36 corosync [pcmk  ] info: spawn_child: Forked child 22447 for
> process lrmd
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [21256]: info:
> lrmd is shutting down
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> WARN: Initializing connection to logging daemon failed. Logging daemon may
> not be running
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: WARN:
> Initializing connection to logging daemon failed. Logging daemon may not be
> running
> Apr 20 10:54:36 corosync [pcmk  ] info: spawn_child: Forked child 22448 for
> process attrd
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: WARN:
> Initializing connection to logging daemon failed. Logging daemon may not be
> running
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: G_main_add_SignalHandler: Added signal handler for signal 10
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info:
> Signal sent to pid=21256, waiting for process to exit
> Apr 20 10:54:36 corosync [pcmk  ] info: spawn_child: Forked child 22449 for
> process pengine
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: WARN:
> Initializing connection to logging daemon failed. Logging daemon may not be
> running
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> Invoked: /usr/lib64/heartbeat/cib
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: G_main_add_SignalHandler: Added signal handler for signal 12
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group pengine: [22449]:
> WARN: Initializing connection to logging daemon failed. Logging daemon may
> not be running
> Apr 20 10:54:36 corosync [pcmk  ] info: spawn_child: Forked child 22450 for
> process crmd
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> Invoked: /usr/lib64/heartbeat/attrd
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> G_main_add_TriggerHandler: Added signal manual handler
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group pengine: [22449]:
> info: Invoked: /usr/lib64/heartbeat/pengine
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: WARN:
> Initializing connection to logging daemon failed. Logging daemon may not be
> running
> Apr 20 10:54:36 corosync [pcmk  ] info: spawn_child: Forked child 22451 for
> process mgmtd
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> main: Starting up
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> G_main_add_SignalHandler: Added signal handler for signal 17
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group pengine: [22449]:
> WARN: main: Terminating previous PE instance
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> retrieveCib: Reading cluster configuration from:
> /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig)
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: crm_cluster_connect: Connecting to OpenAIS
> Apr 20 10:54:36 corosync [SERV  ] Service engine loaded: Pacemaker Cluster
> Manager 1.0.12
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> crm_cluster_connect: Connecting to OpenAIS
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group pengine: [21258]:
> WARN: process_pe_message: Received quit message, terminating
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: init_ais_connection_once: Creating connection to our AIS plugin
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> Invoked: /usr/lib64/heartbeat/crmd
> Apr 20 10:54:36 corosync [SERV  ] Service failed to load 'pacemaker'.
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> init_ais_connection_once: Creating connection to our AIS plugin
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> main: CRM Hg Version: unknown
>
> Apr 20 10:54:36 corosync [SERV  ] Service engine loaded: corosync extended
> virtual synchrony service
> Apr 20 10:54:36 corosync [SERV  ] Service engine loaded: corosync
> configuration service
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> crmd_init: Starting crmd
> Apr 20 10:54:36 corosync [SERV  ] Service engine loaded: corosync cluster
> closed process group service v1.01
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> G_main_add_SignalHandler: Added signal handler for signal 17
> Apr 20 10:54:36 corosync [SERV  ] Service engine loaded: corosync cluster
> config database access v1.01
> Apr 20 10:54:36 corosync [SERV  ] Service engine loaded: corosync profile
> loading service
> Apr 20 10:54:36 corosync [SERV  ] Service engine loaded: corosync cluster
> quorum service v0.1
> Apr 20 10:54:36 corosync [MAIN  ] Compatibility mode set to whitetank.
>  Using V1 and V2 of the synchronization engine.
> Apr 20 10:54:36 corosync [TOTEM ] The network interface [10.255.2.1] is now
> up.
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: init_ais_connection_once: AIS connection established
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> startCib: CIB Initialization completed successfully
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> crm_cluster_connect: Connecting to OpenAIS
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> init_ais_connection_once: Creating connection to our AIS plugin
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> init_ais_connection_once: AIS connection established
> Apr 20 10:54:36 corosync [pcmk  ] info: pcmk_ipc: Recorded connection
> 0x18e7150 for stonithd/22445
> Apr 20 10:54:36 corosync [pcmk  ] info: pcmk_ipc: Recorded connection
> 0x18eb4b0 for attrd/22448
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> get_ais_nodeid: Server details: id=16908042
> uname=lxdcv01nd01.bauer-uk.bauermedia.group cname=pcmk
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: get_ais_nodeid: Server details: id=16908042
> uname=lxdcv01nd01.bauer-uk.bauermedia.group cname=pcmk
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> crm_new_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group now has id:
> 16908042
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: crm_new_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group now has id:
> 16908042
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> crm_new_peer: Node 16908042 is now known as
> lxdcv01nd01.bauer-uk.bauermedia.group
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> main: Cluster connection active
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: crm_new_peer: Node 16908042 is now known as
> lxdcv01nd01.bauer-uk.bauermedia.group
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> main: Accepting attribute updates
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> main: Starting mainloop...
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> notice: /usr/lib64/heartbeat/stonithd start up successfully.
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: G_main_add_SignalHandler: Added signal handler for signal 17
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> init_ais_connection_once: AIS connection established
> Apr 20 10:54:36 corosync [pcmk  ] info: pcmk_ipc: Recorded connection
> 0x18ef810 for cib/22446
> Apr 20 10:54:36 corosync [pcmk  ] info: update_member: Node
> lxdcv01nd01.bauer-uk.bauermedia.group now has process list:
> 00000000000000000000000000053312 (340754)
> Apr 20 10:54:36 corosync [pcmk  ] info: pcmk_ipc: Sending membership update
> 0 to cib
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> get_ais_nodeid: Server details: id=16908042
> uname=lxdcv01nd01.bauer-uk.bauermedia.group cname=pcmk
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> crm_new_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group now has id:
> 16908042
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> crm_new_peer: Node 16908042 is now known as
> lxdcv01nd01.bauer-uk.bauermedia.group
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> cib_init: Starting cib mainloop
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> ais_dispatch: Membership 0: quorum still lost
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> crm_update_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group: id=16908042
> state=member (new) addr=(null) votes=1 (new) born=0 seen=0
> proc=00000000000000000000000000053312 (new)
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22455]: info:
> write_cib_contents: Archived previous version as
> /var/lib/heartbeat/crm/cib-80.raw
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22455]: info:
> write_cib_contents: Wrote version 0.89.0 of the CIB to disk (digest:
> e15d151e0fed09d1d411b21b345a8952)
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22455]: info:
> retrieveCib: Reading cluster configuration from:
> /var/lib/heartbeat/crm/cib.cZHXQX (digest:
> /var/lib/heartbeat/crm/cib.U3NqAd)
> Apr 20 10:54:36 corosync [TOTEM ] Incrementing problem counter for seqid 1
> iface 10.255.2.1 to [1 of 10]
> Apr 20 10:54:36 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 208: memb=0, new=0, lost=0
> Apr 20 10:54:36 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 208: memb=1, new=1, lost=0
> Apr 20 10:54:36 corosync [pcmk  ] info: pcmk_peer_update: NEW:
>  lxdcv01nd01.bauer-uk.bauermedia.group 16908042
> Apr 20 10:54:36 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> lxdcv01nd01.bauer-uk.bauermedia.group 16908042
> Apr 20 10:54:36 corosync [TOTEM ] A processor joined or left the membership
> and a new membership was formed.
> Apr 20 10:54:36 corosync [MAIN  ] Completed service synchronization, ready
> to provide service.
> Apr 20 10:54:37 corosync [pcmk  ] notice: pcmk_peer_update: Transitional
> membership event on ring 212: memb=1, new=0, lost=0
> Apr 20 10:54:37 corosync [pcmk  ] info: pcmk_peer_update: memb:
> lxdcv01nd01.bauer-uk.bauermedia.group 16908042
> Apr 20 10:54:37 corosync [pcmk  ] notice: pcmk_peer_update: Stable
> membership event on ring 212: memb=2, new=1, lost=0
> Apr 20 10:54:37 corosync [pcmk  ] info: update_member: Creating entry for
> node 33685258 born on 212
> Apr 20 10:54:37 corosync [pcmk  ] info: update_member: Node 33685258/unknown
> is now: member
> Apr 20 10:54:37 corosync [pcmk  ] info: pcmk_peer_update: NEW:  .pending.
> 33685258
> Apr 20 10:54:37 corosync [pcmk  ] info: pcmk_peer_update: MEMB:
> lxdcv01nd01.bauer-uk.bauermedia.group 16908042
> Apr 20 10:54:37 corosync [pcmk  ] info: pcmk_peer_update: MEMB: .pending.
> 33685258
> Apr 20 10:54:37 corosync [pcmk  ] info: send_member_notification: Sending
> membership update 212 to 1 children
> Apr 20 10:54:37 corosync [pcmk  ] info: update_member: 0x18db8e0 Node
> 16908042 (lxdcv01nd01.bauer-uk.bauermedia.group) born on: 212
> Apr 20 10:54:37 corosync [TOTEM ] A processor joined or left the membership
> and a new membership was formed.
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> ais_dispatch: Membership 212: quorum still lost
> Apr 20 10:54:37 corosync [pcmk  ] info: update_member: 0x18e6ac0 Node
> 33685258 ((null)) born on: 196
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> crm_new_peer: Node <null> now has id: 33685258
> Apr 20 10:54:37 corosync [pcmk  ] info: update_member: 0x18e6ac0 Node
> 33685258 now known as lxdcv01nd02.bauer-uk.bauermedia.group (was: (null))
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> crm_update_peer: Node (null): id=33685258 state=member (new) addr=r(0)
> ip(10.255.1.2) r(1) ip(10.255.2.2)  votes=0 born=0 seen=212
> proc=00000000000000000000000000000000
> Apr 20 10:54:37 corosync [pcmk  ] info: update_member: Node
> lxdcv01nd02.bauer-uk.bauermedia.group now has process list:
> 00000000000000000000000000013312 (78610)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> crm_update_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group: id=16908042
> state=member addr=r(0) ip(10.255.1.1) r(1) ip(10.255.2.1)  (new) votes=1
> born=0 seen=212 proc=00000000000000000000000000053312
> Apr 20 10:54:37 corosync [pcmk  ] info: update_member: Node
> lxdcv01nd02.bauer-uk.bauermedia.group now has 1 quorum votes (was 0)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: notice:
> ais_dispatch: Membership 212: quorum acquired
> Apr 20 10:54:37 corosync [pcmk  ] info: send_member_notification: Sending
> membership update 212 to 1 children
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> crm_get_peer: Node 33685258 is now known as
> lxdcv01nd02.bauer-uk.bauermedia.group
> Apr 20 10:54:37 corosync [pcmk  ] WARN: route_ais_message: Sending message
> to local.crmd failed: unknown (rc=-2)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> crm_update_peer: Node lxdcv01nd02.bauer-uk.bauermedia.group: id=33685258
> state=member addr=r(0) ip(10.255.1.2) r(1) ip(10.255.2.2)  votes=1 (new)
> born=196 seen=212 proc=00000000000000000000000000013312 (new)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> cib_process_diff: Diff 0.91.3 -> 0.91.4 not applied to 0.89.0: current
> "epoch" is less than required
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> cib_server_process_diff: Requesting re-sync from peer
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: WARN:
> cib_diff_notify: Local-only Change (client:crmd, call: 77): -1.-1.-1
> (Application of an update diff failed, requesting a full refresh)
> Apr 20 10:54:37 corosync [MAIN  ] Completed service synchronization, ready
> to provide service.
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: WARN:
> cib_server_process_diff: Not applying diff 0.91.4 -> 0.91.5 (sync in
> progress)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: WARN:
> cib_server_process_diff: Not applying diff 0.91.5 -> 0.91.6 (sync in
> progress)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: WARN:
> cib_server_process_diff: Not applying diff 0.91.6 -> 0.92.1 (sync in
> progress)
> Apr 20 10:54:37 corosync [pcmk  ] WARN: route_ais_message: Sending message
> to local.crmd failed: unknown (rc=-2)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]: info:
> cib_replace_notify: Local-only Replace: -1.-1.-1 from
> lxdcv01nd02.bauer-uk.bauermedia.group
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22456]: info:
> write_cib_contents: Archived previous version as
> /var/lib/heartbeat/crm/cib-81.raw
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22456]: info:
> write_cib_contents: Wrote version 0.92.0 of the CIB to disk (digest:
> 65cf2f5895618dbd08c40b8c39a479c5)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22456]: info:
> retrieveCib: Reading cluster configuration from:
> /var/lib/heartbeat/crm/cib.8nhla0 (digest:
> /var/lib/heartbeat/crm/cib.nUDbdi)
> Apr 20 10:54:37 corosync [pcmk  ] WARN: route_ais_message: Sending message
> to local.crmd failed: unknown (rc=-2)
> Apr 20 10:54:37 corosync [pcmk  ] ERROR: pcmk_wait_dispatch: Child process
> mgmtd exited (pid=22451, rc=100)
> Apr 20 10:54:37 corosync [pcmk  ] notice: pcmk_wait_dispatch: Child process
> mgmtd no longer wishes to be respawned
> Apr 20 10:54:37 corosync [pcmk  ] info: update_member: Node
> lxdcv01nd01.bauer-uk.bauermedia.group now has process list:
> 00000000000000000000000000013312 (78610)
> Apr 20 10:54:37 corosync [pcmk  ] WARN: route_ais_message: Sending message
> to local.crmd failed: unknown (rc=-2)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info:
> G_main_add_SignalHandler: Added signal handler for signal 15
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info:
> G_main_add_SignalHandler: Added signal handler for signal 17
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info:
> enabling coredumps
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info:
> G_main_add_SignalHandler: Added signal handler for signal 10
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info:
> G_main_add_SignalHandler: Added signal handler for signal 12
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info:
> Started.
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> do_cib_control: CIB connection established
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> crm_cluster_connect: Connecting to OpenAIS
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> init_ais_connection_once: Creating connection to our AIS plugin
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> init_ais_connection_once: AIS connection established
> Apr 20 10:54:37 corosync [pcmk  ] info: pcmk_ipc: Recorded connection
> 0x18f53c0 for crmd/22450
> Apr 20 10:54:37 corosync [pcmk  ] info: pcmk_ipc: Sending membership update
> 212 to crmd
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> get_ais_nodeid: Server details: id=16908042
> uname=lxdcv01nd01.bauer-uk.bauermedia.group cname=pcmk
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> crm_new_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group now has id:
> 16908042
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> crm_new_peer: Node 16908042 is now known as
> lxdcv01nd01.bauer-uk.bauermedia.group
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> do_ha_control: Connected to the cluster
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> do_started: Delaying start, CCM (0000000000100000) not connected
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> crmd_init: Starting crmd's mainloop
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> config_query_callback: Checking for expired actions every 900000ms
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> config_query_callback: Sending expected-votes=2 to corosync
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: notice:
> ais_dispatch: Membership 212: quorum acquired
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> crm_new_peer: Node lxdcv01nd02.bauer-uk.bauermedia.group now has id:
> 33685258
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> crm_new_peer: Node 33685258 is now known as
> lxdcv01nd02.bauer-uk.bauermedia.group
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> crm_update_peer: Node lxdcv01nd02.bauer-uk.bauermedia.group: id=33685258
> state=member (new) addr=r(0) ip(10.255.1.2) r(1) ip(10.255.2.2)  votes=1
> born=196 seen=212 proc=00000000000000000000000000013312
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> crm_update_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group: id=16908042
> state=member (new) addr=r(0) ip(10.255.1.1) r(1) ip(10.255.2.1)  (new)
> votes=1 (new) born=212 seen=212 proc=00000000000000000000000000013312 (new)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> do_started: The local CRM is operational
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> do_state_transition: State transition S_STARTING -> S_PENDING [
> input=I_PENDING cause=C_FSA_INTERNAL origin=do_started ]
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group pengine: [22449]:
> info: main: Starting pengine
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> ais_dispatch: Membership 212: quorum retained
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> update_dc: Set DC to lxdcv01nd02.bauer-uk.bauermedia.group (3.0.1)
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> update_attrd: Connecting to attrd...
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> find_hash_entry: Creating hash entry for terminate
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> do_state_transition: State transition S_PENDING -> S_NOT_DC [ input=I_NOT_DC
> cause=C_HA_MESSAGE origin=do_cl_join_finalize_respond ]
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> find_hash_entry: Creating hash entry for shutdown
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> attrd_local_callback: Sending full refresh (origin=crmd)
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> erase_xpath_callback: Deletion of
> "//node_state[@uname='lxdcv01nd01.bauer-uk.bauermedia.group']/transient_attributes":
> ok (rc=0)
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> crm_new_peer: Node lxdcv01nd02.bauer-uk.bauermedia.group now has id:
> 33685258
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> crm_new_peer: Node 33685258 is now known as
> lxdcv01nd02.bauer-uk.bauermedia.group
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> find_hash_entry: Creating hash entry for probe_complete
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> attrd_perform_update: Delaying operation probe_complete=<null>: cib not
> connected
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> do_lrm_rsc_op: Performing key=6:4:7:e6a3b9c7-c24d-497a-9c07-d6082ee231a9
> op=local-manage_monitor_0 )
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info:
> rsc:local-manage:2: probe
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> do_lrm_rsc_op: Performing key=7:4:7:e6a3b9c7-c24d-497a-9c07-d6082ee231a9
> op=lcdcv01_monitor_0 )
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info:
> rsc:lcdcv01:3: probe
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> process_lrm_event: LRM operation lcdcv01_monitor_0 (call=3, rc=0,
> cib-update=7, confirmed=true) ok
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> process_lrm_event: LRM operation local-manage_monitor_0 (call=2, rc=7,
> cib-update=8, confirmed=true) not running
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> attrd_trigger_update: Sending flush op to all hosts for: probe_complete
> (true)
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> attrd_perform_update: Delaying operation probe_complete=true: cib not
> connected
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> do_lrm_rsc_op: Performing key=9:5:0:e6a3b9c7-c24d-497a-9c07-d6082ee231a9
> op=lcdcv01_stop_0 )
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info:
> rsc:lcdcv01:4: stop
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> attrd_trigger_update: Sending flush op to all hosts for: probe_complete
> (true)
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> attrd_perform_update: Delaying operation probe_complete=true: cib not
> connected
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]: info:
> RA output: (lcdcv01:stop:stderr) logd is not running
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]: info:
> process_lrm_event: LRM operation lcdcv01_stop_0 (call=4, rc=0, cib-update=9,
> confirmed=true) ok
> Apr 20 10:54:38 corosync [TOTEM ] ring 1 active with no faults
> Apr 20 10:54:41 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> cib_connect: Connected to the CIB after 1 signon attempts
> Apr 20 10:54:41 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> cib_connect: Sending full refresh
> Apr 20 10:54:41 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> attrd_trigger_update: Sending flush op to all hosts for: probe_complete
> (true)
> Apr 20 10:54:41 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]: info:
> attrd_perform_update: Sent update 4: probe_complete=true
>
>



--
Dan Frincu
CCNA, RHCE

_______________________________________________
Pacemaker mailing list: Pacemaker [at] oss
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


misch at clusterbau

Apr 20, 2012, 3:37 AM

Post #3 of 4 (355 views)
Re: Corosync / Pacemaker Cluster crashing [In reply to]

> Hi
>
> I have the following cluster setup:
(...)
> When the server crashes I get this in the log:
> Apr 20 10:54:17 corosync [TOTEM ] FAILED TO RECEIVE

Did you by chance not open the firewall for corosync communication?
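If the firewall does turn out to be the cause, rules for the totem traffic could look roughly like the sketch below. It only prints iptables commands instead of applying them. It assumes the ring interfaces are eth4 and eth6 (as named in the original post), that mcastport is 5405 as in the posted corosync.conf, and that corosync also uses the port directly below mcastport (5404 here). The function name corosync_fw_rules is made up for illustration.

```shell
#!/bin/sh
# Print (do not apply) iptables rules that accept corosync totem traffic.
# $1 = mcastport from corosync.conf; remaining args = ring interfaces.
corosync_fw_rules() {
    mcastport="$1"
    shift
    # corosync also uses the port directly below mcastport
    sendport=$((mcastport - 1))
    for iface in "$@"; do
        echo "iptables -I INPUT -i $iface -p udp -m multiport --dports $sendport,$mcastport -j ACCEPT"
    done
}

# eth4/eth6 are the ring interfaces named in the original post (assumption)
corosync_fw_rules 5405 eth4 eth6
```

Review the printed rules before running them as root on both nodes; multicast setups may need further rules (for example for IGMP) depending on the local policy.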

--
Dr. Michael Schwartzkopff
Guardinistr. 63
81375 München

Tel: (0163) 172 50 98


andreas at hastexo

Apr 20, 2012, 4:22 AM

Post #4 of 4 (386 views)
Re: Corosync / Pacemaker Cluster crashing [In reply to]

On 04/20/2012 12:08 PM, Bensch, Kobus wrote:
> Hi
>
> I have the following cluster setup:
>
> 2 physical Dell servers with RHEL6.2 with all the latest patches.
>
> Each server has 3 network connections that looks like this:
>
> BOND0 2 NIC's
>
> ETH4 for Corosync
> ETH6 for corosync
>
> This is the corosync config:
> Cocorsync.conf
> aisexec {
> group: root
> user: root
> }
>
> compatibility: whitetank
> service {
> use_mgmtd: yes
> use_logd: yes
> ver: 0
> name: pacemaker
> }

You also specified that service in /etc/corosync/service.d/pcmk, so it is
defined twice (once with ver: 0 above, once with ver: 1). Remove one of
them. Even better: remove the definition above and install the Pacemaker
1.1.6 and Corosync 1.4.x packages that are available as a technology
preview in RHEL 6.2.
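A quick way to spot such a duplicate is to count the pacemaker service declarations across the config files. This is only a sketch: the helper name count_pacemaker_stanzas is hypothetical, and it assumes each declaration sits on its own "name: pacemaker" line, as in the files posted above.

```shell
#!/bin/sh
# Count "name: pacemaker" lines across the given corosync config files.
# A result greater than 1 suggests the service is declared more than once.
count_pacemaker_stanzas() {
    cat "$@" 2>/dev/null \
        | grep -cE '^[[:space:]]*name:[[:space:]]*pacemaker[[:space:]]*$' \
        || true
}

# Paths as given in the original post
count_pacemaker_stanzas /etc/corosync/corosync.conf /etc/corosync/service.d/pcmk
```

With the configuration quoted in this thread the count would be 2; after removing one of the two stanzas it should drop to 1.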

> totem {
> rrp_mode: active
> join: 180
> max_messages: 20
> vsftype: none
> token: 5000
> consensus: 6000
> secauth: on
> token_retransmits_before_loss_const: 10
> threads: 0
> #threads: 16
> version: 2
> interface {
> bindnetaddr: 10.255.1.0
> mcastaddr: 232.10.1.1
> mcastport: 5405
> ringnumber: 0
> ttl: 1
> }
> interface {
> bindnetaddr: 10.255.2.0
> mcastaddr: 232.10.2.1
> mcastport: 5405
> ringnumber: 1
> ttl: 1
> }
> clear_node_high_bit: yes
> }
> logging {
> to_logfile: yes
> to_syslog: yes
> debug: off
> timestamp: on
> logfile: /var/log/cluster/corosync.log
> to_stderr: no
> fileline: off
> syslog_facility: daemon
> }
> amf {
> mode: disabled
> }
>
> The pacemaker plugin:
> /etc/corosync/service.d/pcmk
> service {
> # Load the Pacemaker Cluster Resource Manager
> name: pacemaker
> ver: 1
> }
>
> Corosync keeps crashing when I try to do anything in the crm cli.
> Whether it is moving resources, creating resources, it does not matter.
>
> The corosync config for now is very simple and looks like this:
> node lxdcv01nd01
> node lxdcv01nd02
> primitive lcdcv01 ocf:heartbeat:IPaddr2 \
> params ip="10.1.0.95" cidr_netmask="32" \
> op monitor interval="30s"
> primitive local-manage ocf:heartbeat:IPaddr2 \
> params ip="127.0.2.1" cidr_netmask="32" \
> op monitor interval="30s"
> location cli-prefer-lcdcv01 lcdcv01 \
> rule $id="cli-prefer-rule-lcdcv01" inf: #uname eq lxdcv01nd02
> location cli-prefer-local-manage local-manage \
> rule $id="cli-prefer-rule-local-manage" inf: #uname eq lxdcv01nd02
> property $id="cib-bootstrap-options" \
> dc-version="1.0.12-unknown" \
> cluster-infrastructure="openais" \
> expected-quorum-votes="2" \
> stonith-enabled="false" \
> no-quorum-policy="ignore"
>
> I tried to disable various config lines but still no joy. Any help would
> be appreciated.
>
> When the server crashes I get this in the log:
> Apr 20 10:54:17 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:17 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:17 corosync [TOTEM ] FAILED TO RECEIVE

There have been problems with delayed multicast messages that could lead
to such errors, but only in older corosync versions; it should not happen
in recent ones. See
http://answerpot.com/showthread.php?1361794-corosync+crashes

Another reason to upgrade to a recent version ;-)
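
To see which corosync version is actually running, the startup line in the log can be parsed. A minimal sketch, assuming the log format shown in this thread; the function name corosync_versions_from_log is made up for illustration, and the sample line is copied from the quoted log.

```shell
#!/bin/sh
# Print the corosync version(s) announced in startup lines read from stdin.
corosync_versions_from_log() {
    sed -n "s/.*Corosync Cluster Engine ('\([^']*\)').*/\1/p"
}

# Sample startup line taken from the log quoted in this thread
echo "Apr 20 10:54:36 corosync [MAIN ] Corosync Cluster Engine ('1.2.7'): started and ready to provide service." \
    | corosync_versions_from_log
```

On a real node the function could be fed from the configured logfile instead, e.g. corosync_versions_from_log < /var/log/cluster/corosync.log.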

Regards,
Andreas

--
Need help with Pacemaker?
http://www.hastexo.com/now

> Apr 20 10:54:18 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:18 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:18 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:18 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:19 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:19 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:19 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:19 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:20 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:20 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:20 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:20 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:21 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:21 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:21 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:21 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:22 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:22 corosync [TOTEM ] FAILED TO RECEIVE
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [21259]:
> ERROR: ais_dispatch: Receiving message body failed: (2) Library error:
> Resource temporarily unavailable (11)
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [21254]:
> ERROR: ais_dispatch: Receiving message body failed: (2) Library error:
> Invalid argument (22)
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [21257]:
> ERROR: ais_dispatch: Receiving message body failed: (2) Library error:
> Resource temporarily unavailable (11)
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group cib: [21255]:
> ERROR: ais_dispatch: Receiving message body failed: (2) Library error:
> Resource temporarily unavailable (11)
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [21259]:
> ERROR: ais_dispatch: AIS connection failed
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [21254]:
> ERROR: ais_dispatch: AIS connection failed
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [21257]:
> ERROR: ais_dispatch: AIS connection failed
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [21259]:
> ERROR: crm_ais_destroy: AIS connection terminated
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group cib: [21255]:
> ERROR: ais_dispatch: AIS connection failed
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [21254]:
> ERROR: AIS connection terminated
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [21257]:
> CRIT: attrd_ais_destroy: Lost connection to OpenAIS service!
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group cib: [21255]:
> ERROR: cib_ais_destroy: AIS connection terminated
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [21257]:
> info: main: Exiting...
> Apr 20 10:54:22 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [21257]:
> ERROR: attrd_cib_connection_destroy: Connection to the CIB terminated...
> Apr 20 10:54:36 corosync [MAIN ] Corosync Cluster Engine ('1.2.7'):
> started and ready to provide service.
> Apr 20 10:54:36 corosync [MAIN ] Corosync built-in features: nss rdma
> Apr 20 10:54:36 corosync [MAIN ] Successfully read main configuration
> file '/etc/corosync/corosync.conf'.
> Apr 20 10:54:36 corosync [TOTEM ] Initializing transport (UDP/IP).
> Apr 20 10:54:36 corosync [TOTEM ] Initializing transmit/receive
> security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Apr 20 10:54:36 corosync [TOTEM ] Initializing transport (UDP/IP).
> Apr 20 10:54:36 corosync [TOTEM ] Initializing transmit/receive
> security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
> Apr 20 10:54:36 corosync [TOTEM ] The network interface [10.255.1.1] is
> now up.
> Apr 20 10:54:36 corosync [pcmk ] info: process_ais_conf: Reading configure
> Set r/w permissions for uid=0, gid=0 on /var/log/cluster/corosync.log
> Apr 20 10:54:36 corosync [pcmk ] info: config_find_init: Local handle:
> 5650605097994944514 for logging
> Apr 20 10:54:36 corosync [pcmk ] info: config_find_next: Processing
> additional logging options...
> Apr 20 10:54:36 corosync [pcmk ] info: get_config_opt: Found 'off' for
> option: debug
> Apr 20 10:54:36 corosync [pcmk ] info: get_config_opt: Found 'yes' for
> option: to_logfile
> Apr 20 10:54:36 corosync [pcmk ] info: get_config_opt: Found
> '/var/log/cluster/corosync.log' for option: logfile
> Apr 20 10:54:36 corosync [pcmk ] info: get_config_opt: Found 'yes' for
> option: to_syslog
> Apr 20 10:54:36 corosync [pcmk ] info: get_config_opt: Found 'daemon'
> for option: syslog_facility
> Apr 20 10:54:36 corosync [pcmk ] info: config_find_init: Local handle:
> 2730409743423111171 for service
> Apr 20 10:54:36 corosync [pcmk ] info: config_find_next: Processing
> additional service options...
> Apr 20 10:54:36 corosync [pcmk ] info: get_config_opt: Defaulting to
> 'pcmk' for option: clustername
> Apr 20 10:54:36 corosync [pcmk ] info: get_config_opt: Found 'yes' for
> option: use_logd
> Apr 20 10:54:36 corosync [pcmk ] info: get_config_opt: Found 'yes' for
> option: use_mgmtd
> Apr 20 10:54:36 corosync [pcmk ] info: pcmk_startup: CRM: Initialized
> Apr 20 10:54:36 corosync [pcmk ] Logging: Initialized pcmk_startup
> Apr 20 10:54:36 corosync [pcmk ] info: pcmk_startup: Maximum core file
> size is: 18446744073709551615
> Apr 20 10:54:36 corosync [pcmk ] info: pcmk_startup: Service: 9
> Apr 20 10:54:36 corosync [pcmk ] info: pcmk_startup: Local hostname:
> lxdcv01nd01.bauer-uk.bauermedia.group
> Apr 20 10:54:36 corosync [pcmk ] info: pcmk_update_nodeid: Local node
> id: 16908042
> Apr 20 10:54:36 corosync [pcmk ] info: update_member: Creating entry
> for node 16908042 born on 0
> Apr 20 10:54:36 corosync [pcmk ] info: update_member: 0x18db8e0 Node
> 16908042 now known as lxdcv01nd01.bauer-uk.bauermedia.group (was: (null))
> Apr 20 10:54:36 corosync [pcmk ] info: update_member: Node
> lxdcv01nd01.bauer-uk.bauermedia.group now has 1 quorum votes (was 0)
> Apr 20 10:54:36 corosync [pcmk ] info: update_member: Node
> 16908042/lxdcv01nd01.bauer-uk.bauermedia.group is now: member
> Apr 20 10:54:36 corosync [pcmk ] info: spawn_child: Forked child 22445
> for process stonithd
> Apr 20 10:54:36 corosync [pcmk ] info: spawn_child: Forked child 22446
> for process cib
> Apr 20 10:54:36 corosync [pcmk ] info: spawn_child: Forked child 22447
> for process lrmd
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [21256]:
> info: lrmd is shutting down
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> WARN: Initializing connection to logging daemon failed. Logging daemon
> may not be running
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]:
> WARN: Initializing connection to logging daemon failed. Logging daemon
> may not be running
> Apr 20 10:54:36 corosync [pcmk ] info: spawn_child: Forked child 22448
> for process attrd
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> WARN: Initializing connection to logging daemon failed. Logging daemon
> may not be running
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: G_main_add_SignalHandler: Added signal handler for signal 10
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]:
> info: Signal sent to pid=21256, waiting for process to exit
> Apr 20 10:54:36 corosync [pcmk ] info: spawn_child: Forked child 22449
> for process pengine
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> WARN: Initializing connection to logging daemon failed. Logging daemon
> may not be running
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: Invoked: /usr/lib64/heartbeat/cib
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: G_main_add_SignalHandler: Added signal handler for signal 12
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group pengine: [22449]:
> WARN: Initializing connection to logging daemon failed. Logging daemon
> may not be running
> Apr 20 10:54:36 corosync [pcmk ] info: spawn_child: Forked child 22450
> for process crmd
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: Invoked: /usr/lib64/heartbeat/attrd
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: G_main_add_TriggerHandler: Added signal manual handler
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group pengine: [22449]:
> info: Invoked: /usr/lib64/heartbeat/pengine
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> WARN: Initializing connection to logging daemon failed. Logging daemon
> may not be running
> Apr 20 10:54:36 corosync [pcmk ] info: spawn_child: Forked child 22451
> for process mgmtd
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: main: Starting up
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: G_main_add_SignalHandler: Added signal handler for signal 17
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group pengine: [22449]:
> WARN: main: Terminating previous PE instance
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: retrieveCib: Reading cluster configuration from:
> /var/lib/heartbeat/crm/cib.xml (digest: /var/lib/heartbeat/crm/cib.xml.sig)
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: crm_cluster_connect: Connecting to OpenAIS
> Apr 20 10:54:36 corosync [SERV ] Service engine loaded: Pacemaker
> Cluster Manager 1.0.12
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: crm_cluster_connect: Connecting to OpenAIS
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group pengine: [21258]:
> WARN: process_pe_message: Received quit message, terminating
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: init_ais_connection_once: Creating connection to our AIS plugin
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: Invoked: /usr/lib64/heartbeat/crmd
> Apr 20 10:54:36 corosync [SERV ] Service failed to load 'pacemaker'.
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: init_ais_connection_once: Creating connection to our AIS plugin
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: main: CRM Hg Version: unknown
>
> Apr 20 10:54:36 corosync [SERV ] Service engine loaded: corosync
> extended virtual synchrony service
> Apr 20 10:54:36 corosync [SERV ] Service engine loaded: corosync
> configuration service
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: crmd_init: Starting crmd
> Apr 20 10:54:36 corosync [SERV ] Service engine loaded: corosync
> cluster closed process group service v1.01
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: G_main_add_SignalHandler: Added signal handler for signal 17
> Apr 20 10:54:36 corosync [SERV ] Service engine loaded: corosync
> cluster config database access v1.01
> Apr 20 10:54:36 corosync [SERV ] Service engine loaded: corosync
> profile loading service
> Apr 20 10:54:36 corosync [SERV ] Service engine loaded: corosync
> cluster quorum service v0.1
> Apr 20 10:54:36 corosync [MAIN ] Compatibility mode set to whitetank.
> Using V1 and V2 of the synchronization engine.
> Apr 20 10:54:36 corosync [TOTEM ] The network interface [10.255.2.1] is
> now up.
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: init_ais_connection_once: AIS connection established
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: startCib: CIB Initialization completed successfully
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: crm_cluster_connect: Connecting to OpenAIS
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: init_ais_connection_once: Creating connection to our AIS plugin
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: init_ais_connection_once: AIS connection established
> Apr 20 10:54:36 corosync [pcmk ] info: pcmk_ipc: Recorded connection
> 0x18e7150 for stonithd/22445
> Apr 20 10:54:36 corosync [pcmk ] info: pcmk_ipc: Recorded connection
> 0x18eb4b0 for attrd/22448
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: get_ais_nodeid: Server details: id=16908042
> uname=lxdcv01nd01.bauer-uk.bauermedia.group cname=pcmk
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: get_ais_nodeid: Server details: id=16908042
> uname=lxdcv01nd01.bauer-uk.bauermedia.group cname=pcmk
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: crm_new_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group now has
> id: 16908042
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: crm_new_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group now has
> id: 16908042
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: crm_new_peer: Node 16908042 is now known as
> lxdcv01nd01.bauer-uk.bauermedia.group
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: main: Cluster connection active
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: crm_new_peer: Node 16908042 is now known as
> lxdcv01nd01.bauer-uk.bauermedia.group
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: main: Accepting attribute updates
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: main: Starting mainloop...
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> notice: /usr/lib64/heartbeat/stonithd start up successfully.
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group stonithd: [22445]:
> info: G_main_add_SignalHandler: Added signal handler for signal 17
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: init_ais_connection_once: AIS connection established
> Apr 20 10:54:36 corosync [pcmk ] info: pcmk_ipc: Recorded connection
> 0x18ef810 for cib/22446
> Apr 20 10:54:36 corosync [pcmk ] info: update_member: Node
> lxdcv01nd01.bauer-uk.bauermedia.group now has process list:
> 00000000000000000000000000053312 (340754)
> Apr 20 10:54:36 corosync [pcmk ] info: pcmk_ipc: Sending membership
> update 0 to cib
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: get_ais_nodeid: Server details: id=16908042
> uname=lxdcv01nd01.bauer-uk.bauermedia.group cname=pcmk
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: crm_new_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group now has
> id: 16908042
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: crm_new_peer: Node 16908042 is now known as
> lxdcv01nd01.bauer-uk.bauermedia.group
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: cib_init: Starting cib mainloop
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: ais_dispatch: Membership 0: quorum still lost
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: crm_update_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group:
> id=16908042 state=member (new) addr=(null) votes=1 (new) born=0 seen=0
> proc=00000000000000000000000000053312 (new)
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22455]:
> info: write_cib_contents: Archived previous version as
> /var/lib/heartbeat/crm/cib-80.raw
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22455]:
> info: write_cib_contents: Wrote version 0.89.0 of the CIB to disk
> (digest: e15d151e0fed09d1d411b21b345a8952)
> Apr 20 10:54:36 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22455]:
> info: retrieveCib: Reading cluster configuration from:
> /var/lib/heartbeat/crm/cib.cZHXQX (digest:
> /var/lib/heartbeat/crm/cib.U3NqAd)
> Apr 20 10:54:36 corosync [TOTEM ] Incrementing problem counter for seqid
> 1 iface 10.255.2.1 to [1 of 10]
> Apr 20 10:54:36 corosync [pcmk ] notice: pcmk_peer_update: Transitional
> membership event on ring 208: memb=0, new=0, lost=0
> Apr 20 10:54:36 corosync [pcmk ] notice: pcmk_peer_update: Stable
> membership event on ring 208: memb=1, new=1, lost=0
> Apr 20 10:54:36 corosync [pcmk ] info: pcmk_peer_update: NEW:
> lxdcv01nd01.bauer-uk.bauermedia.group 16908042
> Apr 20 10:54:36 corosync [pcmk ] info: pcmk_peer_update: MEMB:
> lxdcv01nd01.bauer-uk.bauermedia.group 16908042
> Apr 20 10:54:36 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Apr 20 10:54:36 corosync [MAIN ] Completed service synchronization,
> ready to provide service.
> Apr 20 10:54:37 corosync [pcmk ] notice: pcmk_peer_update: Transitional
> membership event on ring 212: memb=1, new=0, lost=0
> Apr 20 10:54:37 corosync [pcmk ] info: pcmk_peer_update: memb:
> lxdcv01nd01.bauer-uk.bauermedia.group 16908042
> Apr 20 10:54:37 corosync [pcmk ] notice: pcmk_peer_update: Stable
> membership event on ring 212: memb=2, new=1, lost=0
> Apr 20 10:54:37 corosync [pcmk ] info: update_member: Creating entry
> for node 33685258 born on 212
> Apr 20 10:54:37 corosync [pcmk ] info: update_member: Node
> 33685258/unknown is now: member
> Apr 20 10:54:37 corosync [pcmk ] info: pcmk_peer_update: NEW:
> .pending. 33685258
> Apr 20 10:54:37 corosync [pcmk ] info: pcmk_peer_update: MEMB:
> lxdcv01nd01.bauer-uk.bauermedia.group 16908042
> Apr 20 10:54:37 corosync [pcmk ] info: pcmk_peer_update: MEMB:
> .pending. 33685258
> Apr 20 10:54:37 corosync [pcmk ] info: send_member_notification:
> Sending membership update 212 to 1 children
> Apr 20 10:54:37 corosync [pcmk ] info: update_member: 0x18db8e0 Node
> 16908042 (lxdcv01nd01.bauer-uk.bauermedia.group) born on: 212
> Apr 20 10:54:37 corosync [TOTEM ] A processor joined or left the
> membership and a new membership was formed.
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: ais_dispatch: Membership 212: quorum still lost
> Apr 20 10:54:37 corosync [pcmk ] info: update_member: 0x18e6ac0 Node
> 33685258 ((null)) born on: 196
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: crm_new_peer: Node <null> now has id: 33685258
> Apr 20 10:54:37 corosync [pcmk ] info: update_member: 0x18e6ac0 Node
> 33685258 now known as lxdcv01nd02.bauer-uk.bauermedia.group (was: (null))
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: crm_update_peer: Node (null): id=33685258 state=member (new)
> addr=r(0) ip(10.255.1.2) r(1) ip(10.255.2.2) votes=0 born=0 seen=212
> proc=00000000000000000000000000000000
> Apr 20 10:54:37 corosync [pcmk ] info: update_member: Node
> lxdcv01nd02.bauer-uk.bauermedia.group now has process list:
> 00000000000000000000000000013312 (78610)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: crm_update_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group:
> id=16908042 state=member addr=r(0) ip(10.255.1.1) r(1) ip(10.255.2.1)
> (new) votes=1 born=0 seen=212 proc=00000000000000000000000000053312
> Apr 20 10:54:37 corosync [pcmk ] info: update_member: Node
> lxdcv01nd02.bauer-uk.bauermedia.group now has 1 quorum votes (was 0)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> notice: ais_dispatch: Membership 212: quorum acquired
> Apr 20 10:54:37 corosync [pcmk ] info: send_member_notification:
> Sending membership update 212 to 1 children
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: crm_get_peer: Node 33685258 is now known as
> lxdcv01nd02.bauer-uk.bauermedia.group
> Apr 20 10:54:37 corosync [pcmk ] WARN: route_ais_message: Sending
> message to local.crmd failed: unknown (rc=-2)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: crm_update_peer: Node lxdcv01nd02.bauer-uk.bauermedia.group:
> id=33685258 state=member addr=r(0) ip(10.255.1.2) r(1) ip(10.255.2.2)
> votes=1 (new) born=196 seen=212 proc=00000000000000000000000000013312 (new)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: cib_process_diff: Diff 0.91.3 -> 0.91.4 not applied to 0.89.0:
> current "epoch" is less than required
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: cib_server_process_diff: Requesting re-sync from peer
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> WARN: cib_diff_notify: Local-only Change (client:crmd, call: 77):
> -1.-1.-1 (Application of an update diff failed, requesting a full refresh)
> Apr 20 10:54:37 corosync [MAIN ] Completed service synchronization,
> ready to provide service.
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> WARN: cib_server_process_diff: Not applying diff 0.91.4 -> 0.91.5 (sync
> in progress)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> WARN: cib_server_process_diff: Not applying diff 0.91.5 -> 0.91.6 (sync
> in progress)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> WARN: cib_server_process_diff: Not applying diff 0.91.6 -> 0.92.1 (sync
> in progress)
> Apr 20 10:54:37 corosync [pcmk ] WARN: route_ais_message: Sending
> message to local.crmd failed: unknown (rc=-2)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22446]:
> info: cib_replace_notify: Local-only Replace: -1.-1.-1 from
> lxdcv01nd02.bauer-uk.bauermedia.group
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22456]:
> info: write_cib_contents: Archived previous version as
> /var/lib/heartbeat/crm/cib-81.raw
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22456]:
> info: write_cib_contents: Wrote version 0.92.0 of the CIB to disk
> (digest: 65cf2f5895618dbd08c40b8c39a479c5)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group cib: [22456]:
> info: retrieveCib: Reading cluster configuration from:
> /var/lib/heartbeat/crm/cib.8nhla0 (digest:
> /var/lib/heartbeat/crm/cib.nUDbdi)
> Apr 20 10:54:37 corosync [pcmk ] WARN: route_ais_message: Sending
> message to local.crmd failed: unknown (rc=-2)
> Apr 20 10:54:37 corosync [pcmk ] ERROR: pcmk_wait_dispatch: Child
> process mgmtd exited (pid=22451, rc=100)
> Apr 20 10:54:37 corosync [pcmk ] notice: pcmk_wait_dispatch: Child
> process mgmtd no longer wishes to be respawned
> Apr 20 10:54:37 corosync [pcmk ] info: update_member: Node
> lxdcv01nd01.bauer-uk.bauermedia.group now has process list:
> 00000000000000000000000000013312 (78610)
> Apr 20 10:54:37 corosync [pcmk ] WARN: route_ais_message: Sending
> message to local.crmd failed: unknown (rc=-2)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]:
> info: G_main_add_SignalHandler: Added signal handler for signal 15
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]:
> info: G_main_add_SignalHandler: Added signal handler for signal 17
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]:
> info: enabling coredumps
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]:
> info: G_main_add_SignalHandler: Added signal handler for signal 10
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]:
> info: G_main_add_SignalHandler: Added signal handler for signal 12
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]:
> info: Started.
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: do_cib_control: CIB connection established
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: crm_cluster_connect: Connecting to OpenAIS
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: init_ais_connection_once: Creating connection to our AIS plugin
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: init_ais_connection_once: AIS connection established
> Apr 20 10:54:37 corosync [pcmk ] info: pcmk_ipc: Recorded connection
> 0x18f53c0 for crmd/22450
> Apr 20 10:54:37 corosync [pcmk ] info: pcmk_ipc: Sending membership
> update 212 to crmd
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: get_ais_nodeid: Server details: id=16908042
> uname=lxdcv01nd01.bauer-uk.bauermedia.group cname=pcmk
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: crm_new_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group now has
> id: 16908042
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: crm_new_peer: Node 16908042 is now known as
> lxdcv01nd01.bauer-uk.bauermedia.group
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: do_ha_control: Connected to the cluster
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: do_started: Delaying start, CCM (0000000000100000) not connected
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: crmd_init: Starting crmd's mainloop
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: config_query_callback: Checking for expired actions every 900000ms
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: config_query_callback: Sending expected-votes=2 to corosync
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> notice: ais_dispatch: Membership 212: quorum acquired
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: crm_new_peer: Node lxdcv01nd02.bauer-uk.bauermedia.group now has
> id: 33685258
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: crm_new_peer: Node 33685258 is now known as
> lxdcv01nd02.bauer-uk.bauermedia.group
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: crm_update_peer: Node lxdcv01nd02.bauer-uk.bauermedia.group:
> id=33685258 state=member (new) addr=r(0) ip(10.255.1.2) r(1)
> ip(10.255.2.2) votes=1 born=196 seen=212
> proc=00000000000000000000000000013312
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: crm_update_peer: Node lxdcv01nd01.bauer-uk.bauermedia.group:
> id=16908042 state=member (new) addr=r(0) ip(10.255.1.1) r(1)
> ip(10.255.2.1) (new) votes=1 (new) born=212 seen=212
> proc=00000000000000000000000000013312 (new)
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: do_started: The local CRM is operational
> Apr 20 10:54:37 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: do_state_transition: State transition S_STARTING -> S_PENDING [
> input=I_PENDING cause=C_FSA_INTERNAL origin=do_started ]
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group pengine: [22449]:
> info: main: Starting pengine
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: ais_dispatch: Membership 212: quorum retained
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: update_dc: Set DC to lxdcv01nd02.bauer-uk.bauermedia.group (3.0.1)
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: update_attrd: Connecting to attrd...
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: find_hash_entry: Creating hash entry for terminate
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: do_state_transition: State transition S_PENDING -> S_NOT_DC [
> input=I_NOT_DC cause=C_HA_MESSAGE origin=do_cl_join_finalize_respond ]
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: find_hash_entry: Creating hash entry for shutdown
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: attrd_local_callback: Sending full refresh (origin=crmd)
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: erase_xpath_callback: Deletion of
> "//node_state[@uname='lxdcv01nd01.bauer-uk.bauermedia.group']/transient_attributes":
> ok (rc=0)
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: crm_new_peer: Node lxdcv01nd02.bauer-uk.bauermedia.group now has
> id: 33685258
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: crm_new_peer: Node 33685258 is now known as
> lxdcv01nd02.bauer-uk.bauermedia.group
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: find_hash_entry: Creating hash entry for probe_complete
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: attrd_perform_update: Delaying operation probe_complete=<null>:
> cib not connected
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: do_lrm_rsc_op: Performing
> key=6:4:7:e6a3b9c7-c24d-497a-9c07-d6082ee231a9 op=local-manage_monitor_0 )
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]:
> info: rsc:local-manage:2: probe
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: do_lrm_rsc_op: Performing
> key=7:4:7:e6a3b9c7-c24d-497a-9c07-d6082ee231a9 op=lcdcv01_monitor_0 )
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]:
> info: rsc:lcdcv01:3: probe
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: process_lrm_event: LRM operation lcdcv01_monitor_0 (call=3, rc=0,
> cib-update=7, confirmed=true) ok
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: process_lrm_event: LRM operation local-manage_monitor_0 (call=2,
> rc=7, cib-update=8, confirmed=true) not running
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: attrd_trigger_update: Sending flush op to all hosts for:
> probe_complete (true)
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: attrd_perform_update: Delaying operation probe_complete=true: cib
> not connected
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: do_lrm_rsc_op: Performing
> key=9:5:0:e6a3b9c7-c24d-497a-9c07-d6082ee231a9 op=lcdcv01_stop_0 )
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]:
> info: rsc:lcdcv01:4: stop
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: attrd_trigger_update: Sending flush op to all hosts for:
> probe_complete (true)
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: attrd_perform_update: Delaying operation probe_complete=true: cib
> not connected
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group lrmd: [22447]:
> info: RA output: (lcdcv01:stop:stderr) logd is not running
> Apr 20 10:54:38 lxdcv01nd01.bauer-uk.bauermedia.group crmd: [22450]:
> info: process_lrm_event: LRM operation lcdcv01_stop_0 (call=4, rc=0,
> cib-update=9, confirmed=true) ok
> Apr 20 10:54:38 corosync [TOTEM ] ring 1 active with no faults
> Apr 20 10:54:41 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: cib_connect: Connected to the CIB after 1 signon attempts
> Apr 20 10:54:41 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: cib_connect: Sending full refresh
> Apr 20 10:54:41 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: attrd_trigger_update: Sending flush op to all hosts for:
> probe_complete (true)
> Apr 20 10:54:41 lxdcv01nd01.bauer-uk.bauermedia.group attrd: [22448]:
> info: attrd_perform_update: Sent update 4: probe_complete=true
