
Mailing List Archive: Linux-HA: Dev

stonithd received sigabt when testing by CTS.

 

 



aoshima.kentaro at nttcom

Jul 21, 2008, 11:56 PM

stonithd received sigabt when testing by CTS.

Hi all,

I'm Aoshima. I'd like to ask about CTS for 2.1.4.
We are trying to run CTS on RHEL 5.2 (x86-64) against 2.1.4 at
changeset "12148:e902ad7642fd".

heartbeat and the resource agents appear to fail to start
after we launch CTSlab.py.
According to ha-log, when the lrmd process tried to start the
STONITH resource, something went wrong and stonithd
received SIGABRT.

Is any extra configuration needed when using a STONITH device
(e.g. external/ssh) with CTS?

Best regards

Our test environment is shown below.
---ha.cf---
crm on
logfile /var/log/ha-log
logfacility local7
keepalive 2
deadtime 30
warntime 10
initdead 60
udpport 651
auto_failback legacy
bcast eth1
bcast eth3

node cgl49
node cgl50

---ha-log---
lrmd[8899]: 2008/07/22_14:12:04 info: rsc:ocf_msdummy:0: start
lrmd[9361]: 2008/07/22_14:12:04 info: Try to start STONITH resource <rsc_id=child_DoFencing:0> : Device=external/ssh
stonithd[8900]: 2008/07/22_14:12:04 ERROR: ipc_bufpool_update: magic number in head does not match.Something very bad happened, abort now, farside pid =9361
stonithd[8900]: 2008/07/22_14:12:04 ERROR: magic=6c754a3e, expected value=abcd
stonithd[8900]: 2008/07/22_14:12:04 info: pool: refcount=1, startpos=0x17dc4ba8, currpos=0x17dc4c8d,consumepos=0x17dc4c14, endpos=0x17dc5b78, size=4096
stonithd[8900]: 2008/07/22_14:12:04 info: nmsgs=0
lrmd[8899]: 2008/07/22_14:12:04 info: rsc:ocf_msdummy:1: start
lrmd[9361]: 2008/07/22_14:12:04 ERROR: stonithd_virtual_stonithRA_ops: failed to fetch reply
tengine[8951]: 2008/07/22_14:12:04 ERROR: stonithd_op_result_ready: not signed on
heartbeat[8887]: 2008/07/22_14:12:04 WARN: Managed /usr/lib64/heartbeat/stonithd process 8900 killed by signal 6 [SIGABRT - Abort].
lrmd[9361]: 2008/07/22_14:12:04 ERROR: sending stonithRA op to stonithd failed.
tengine[8951]: 2008/07/22_14:12:04 ERROR: tengine_stonith_connection_destroy: Fencing daemon has left us
heartbeat[8887]: 2008/07/22_14:12:04 ERROR: Managed /usr/lib64/heartbeat/stonithd process 8900 dumped core
lrmd[9361]: 2008/07/22_14:12:04 notice: Not currently connected.
tengine[8951]: 2008/07/22_14:12:04 info: te_connect_stonith: Attempting connection to fencing daemon...
heartbeat[8887]: 2008/07/22_14:12:04 ERROR: Respawning client "/usr/lib64/heartbeat/stonithd":
heartbeat[8887]: 2008/07/22_14:12:04 info: Starting child client "/usr/lib64/heartbeat/stonithd" (0,0)
lrmd[8899]: 2008/07/22_14:12:04 WARN: mapped the invalid return code 254.
crmd[8902]: 2008/07/22_14:12:04 info: process_lrm_event: LRM operation child_DoFencing:0_start_0 (call=19, rc=1) complete
heartbeat[9368]: 2008/07/22_14:12:04 info: Starting "/usr/lib64/heartbeat/stonithd" as uid 0 gid 0 (pid 9368)
_______________________________________________________
Linux-HA-Dev: Linux-HA-Dev[at]lists.linux-ha.org
http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
Home Page: http://linux-ha.org/
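
A note on the "magic=6c754a3e" line in the log above: one quick way to
inspect a mismatched magic number is to print its four bytes as ASCII.
The sketch below does only that. The helper name decode_magic is made up
for illustration and is not part of heartbeat, and the little-endian
default is an assumption based on the x86-64 platform named in the post.
If the bytes turn out to be printable, that may hint that text data was
read where a message header was expected, but this is not a confirmed
diagnosis of the crash.

```python
def decode_magic(hex_value: str, byteorder: str = "little") -> str:
    """Render a 32-bit hex value as 4 ASCII characters in memory order.

    Non-printable bytes are shown as '.'.
    decode_magic is a hypothetical helper, not a heartbeat function.
    """
    raw = int(hex_value, 16).to_bytes(4, byteorder)
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


if __name__ == "__main__":
    # Value copied from the "magic=" field of the stonithd ERROR line.
    print(decode_magic("6c754a3e"))
```

Passing byteorder="big" shows the same bytes in the order they are
written in the log message instead of the order they sit in memory.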


aoshima.kentaro at nttcom

Jul 27, 2008, 9:17 PM

Re: stonithd received sigabt when testing by CTS.

Hi all,

We would like to report that we have completed
the CTS test 500 times for "changeset: 12158: a8b2fc037b29"
on RHEL 5.2 (x86-64).

That changeset includes some patches (changeset 12153) for the LRM
by Dejan. The issue we reported earlier did not
occur, and the attached log shows "success" on all tests.

We are not familiar with CTS, and we wonder whether CTS on SUSE or any
other OS gives the same results as ours.
If anyone has run CTS, please let us know the result.

Best regards

Attachments: ctslab_20080728.log (33.0 KB)
