
Mailing List Archive: Linux-HA: Pacemaker

unpack_rsc_op: Hard error

 

 



pavlos.parissis at gmail

Oct 9, 2010, 2:20 PM

Post #1 of 6
unpack_rsc_op: Hard error

Hi,

Does anyone know why PE wants to unpack resources on nodes where they
will never run due to location constraints?
I am getting these messages and I am wondering if they are harmless or not.

23:12:38 pengine: [7705]: notice: unpack_rsc_op: Hard error -
sshd-pbx_01_monitor_0 failed with rc=5: Preventing sshd-pbx_01 from
re-starting on node-02
23:12:38 pengine: [7705]: notice: unpack_rsc_op: Hard error -
pbx_01_monitor_0 failed with rc=5: Preventing pbx_01 from re-starting
on node-02
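
To read these messages: the operation key has the form <resource>_<action>_<interval>, so pbx_01_monitor_0 is a monitor with interval 0 (a probe) of pbx_01, and rc=5 is the "not installed" return code. A minimal sketch for splitting such a message into its fields, assuming the wrapped log lines have first been rejoined into one line; the function name parse_hard_error is hypothetical and not part of Pacemaker:

```shell
#!/bin/sh
# Hypothetical helper (not a Pacemaker tool): split a pengine
# "Hard error" message into resource, action, interval, rc and node.
# Expects the whole message on a single line.
parse_hard_error() {
    printf '%s\n' "$1" | sed -n \
        's/.*Hard error - \(.*\)_\([a-z]*\)_\([0-9]*\) failed with rc=\([0-9]*\): Preventing .* from re-starting on \(.*\)$/resource=\1 action=\2 interval=\3 rc=\4 node=\5/p'
}

# Example with one of the messages above, rejoined into one line.
parse_hard_error "notice: unpack_rsc_op: Hard error - pbx_01_monitor_0 failed with rc=5: Preventing pbx_01 from re-starting on node-02"
# prints: resource=pbx_01 action=monitor interval=0 rc=5 node=node-02
```

Lines that do not match the pattern produce no output.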

Cheers,
Pavlos

_______________________________________________
Pacemaker mailing list: Pacemaker [at] oss
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://developerbugs.linux-foundation.org/enter_bug.cgi?product=Pacemaker


pavlos.parissis at gmail

Oct 10, 2010, 4:30 AM

Post #2 of 6
Re: unpack_rsc_op: Hard error


It seems that a return code of 5 from an LSB script confuses the cluster.
I have made my init script LSB compliant, and it passes the tests
here [1], but I have also implemented what is mentioned here [2]
regarding the exit codes.
The exit code 5 causes trouble: when the cluster runs the monitor on the
slave node, where no resources are active, it gets rc=5.
If I remove the exit 5, everything is fine. Is this expected behavior?
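
For context, the LSB spec [2] defines exit code 5 ("program is not installed") only for actions other than status; for status it defines 0 (running), 1 and 2 (dead with a stale pid or lock file), 3 (not running) and 4 (unknown). One possible way to keep the "not installed" check without returning 5 to a status call is sketched below. This is an illustration under those assumptions, not a fix confirmed in this thread, and the function name missing_binary_rc is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: pick the exit code to use when the daemon
# binary is missing, depending on the requested init script action.
missing_binary_rc() {
    case "$1" in
        status) echo 3 ;;  # LSB status code: program is not running
        stop)   echo 0 ;;  # LSB: stop on a service that is not running succeeds
        *)      echo 5 ;;  # LSB: program is not installed
    esac
}
```

In an init script like the one below, the check could then read: [ -x "$INSTALLDIR/$PBX" ] || exit "$(missing_binary_rc "$1")"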


[1]http://www.clusterlabs.org/doc/en-US/Pacemaker/1.0/html/Pacemaker_Explained/ap-lsb.html

[2]http://refspecs.freestandards.org/LSB_3.1.0/LSB-Core-generic/LSB-Core-generic/iniscrptact.html

The init script:
[root [at] node-0 ~]# cat /etc/init.d/znd-pbx_01
#!/bin/bash
#
### BEGIN INIT INFO
# Provides: pbx_01
# Required-Start: $local_fs $network
# Required-Stop: $local_fs $network
# Default-Start: 3 4 5
# Default-Stop: 0 1 2 6
# Short-Description: start and stop pbx_01
# Description: Init script for pbxnsip.
### END INIT INFO

# source function library
. /etc/init.d/functions

RETVAL=0

# Installation location
INSTALLDIR=/pbx_service_01/pbxnsip
PBX_CONFIG=$INSTALLDIR/pbx.xml
PBX=pbx_01
PID_FILE=/var/run/$PBX.pid
LOCK_FILE=/var/lock/subsys/$PBX
PBX_OPTIONS="--dir $INSTALLDIR --config $PBX_CONFIG --pidfile $PID_FILE"

#sleep 10;

#[ -x $INSTALLDIR/$PBX ] || exit 5

start()
{
    echo -n "Starting PBX: "
    daemon --pidfile $PID_FILE $INSTALLDIR/$PBX $PBX_OPTIONS
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && touch $LOCK_FILE
    return $RETVAL
}

stop()
{
    echo -n "Stopping PBX: "
    killproc -p $PID_FILE $PBX
    RETVAL=$?
    echo
    [ $RETVAL -eq 0 ] && rm -f $LOCK_FILE
    return $RETVAL
}

case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        stop
        start
        ;;
    force-reload)
        stop
        start
        ;;
    status)
        status -p $PID_FILE $PBX
        RETVAL=$?
        ;;
    *)
        echo $"Usage: $0 {start|stop|restart|force-reload|status}"
        exit 2
esac
exit $RETVAL
[root [at] node-0 ~]#



andrew at beekhof

Oct 10, 2010, 8:39 AM

Post #3 of 6
Re: unpack_rsc_op: Hard error

On Sat, Oct 9, 2010 at 11:20 PM, Pavlos Parissis
<pavlos.parissis [at] gmail> wrote:
> Hi,
>
> Does anyone know why PE wants to unpack resources on nodes where they
> will never run due to location constraints?

Because part of its job is to make sure they don't run there.

> I am getting these messages and I am wondering if they are harmless or not.

Basically yes. We've since reduced this to an informational message.



pavlos.parissis at gmail

Oct 11, 2010, 2:25 AM

Post #4 of 6
Re: unpack_rsc_op: Hard error

On 10 October 2010 17:39, Andrew Beekhof <andrew [at] beekhof> wrote:
> On Sat, Oct 9, 2010 at 11:20 PM, Pavlos Parissis
> <pavlos.parissis [at] gmail> wrote:
>> Hi,
>>
>> Does anyone know why PE wants to unpack resources on nodes where they
>> will never run due to location constraints?
>
> Because part of its job is to make sure they don't run there.
>
>> I am getting these messages and I am wondering if they are harmless or not.
>
> Basically yes. We've since reduced this to an informational message.
>
So, it is not necessary to place the LSB script of a resource on nodes
where the resource will never run due to location constraints. Am I
right?

Cheers,
Pavlos



andrew at beekhof

Oct 19, 2010, 5:16 AM

Post #5 of 6
Re: unpack_rsc_op: Hard error

On Mon, Oct 11, 2010 at 11:25 AM, Pavlos Parissis
<pavlos.parissis [at] gmail> wrote:
> So, it is not necessary to place the LSB script of a resource on nodes
> where the resource will never run due to location constraints. Am I
> right?

Correct, though the probes might show up in crm_mon as "failed".



pavlos.parissis at gmail

Oct 19, 2010, 5:33 AM

Post #6 of 6
Re: unpack_rsc_op: Hard error

On 19 October 2010 14:16, Andrew Beekhof <andrew [at] beekhof> wrote:

> > So, it is not necessary to place the LSB script of a resource on nodes
> > where the resource will never run due to location constraints. Am I
> > right?
>
> Correct, though the probes might show up in crm_mon as "failed".
Even though that is correct, I placed the script on all nodes, just to
avoid the warnings.

Thanks,
Pavlos
