
Mailing List Archive: Linux-HA: Pacemaker

Why monitor fails in my RA

 

 



test123 at implix

Apr 25, 2012, 1:41 PM

Post #1 of 8 (1205 views)
Permalink
Why monitor fails in my RA

Hi,

I am trying to write a Redis resource agent that works in master-slave
mode. My configuration:
node s1
node s2
primitive ip-redis ocf:heartbeat:IPaddr2 \
        params ip="192.168.1.15" nic="eth0" cidr_netmask="24" \
        op monitor interval="10s" timeout="30s" \
        meta target-role="Started"
primitive redis-server ocf:implix:redis4 \
        op start interval="0" timeout="60s" \
        op stop interval="0" timeout="60s" \
        op monitor interval="5s" role="Master" timeout="60s" \
        op monitor interval="10s" role="Slave" timeout="60s" \
        params masterip="192.168.1.15"
ms redis-ms redis-server \
        meta master-max="1" master-node-max="1" clone-max="2" \
        clone-node-max="1" target-role="Master"
colocation co-redis-ms inf: ip-redis redis-ms:Master
order or-redis inf: redis-ms:promote ip-redis:start
property $id="cib-bootstrap-options" \
        dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
        cluster-infrastructure="openais" \
        no-quorum-policy="ignore" \
        stonith-enabled="false" \
        expected-quorum-votes="2" \
        default-action-timeout="20s" \
        last-lrm-refresh="1335271825" \
        default-resource-stickiness="10"


To simplify the RA, all Redis nodes start as slaves (that's why I need to
pass masterip in the configuration).

The script works well: it promotes the secondary when the master node goes
down, but only a few times. At some point, usually after the second or third
master failure (I kill the process manually), I get this error:
redis-server:0_monitor_5000 (node=s1, call=16, rc=9, status=complete):
master (failed)

My monitor function (simplified, with overhead removed and some
comments added) is:
redis_monitor() {
        # I set score 10 for master, 5 is for slave
        CURSCORE=`$CRM_MASTER -G -q`
        logger "redis_monitor: score $CURSCORE"
        local state
        redis_state

        # RET holds the current local redis state
        state=$(echo "${RET}" | cut -d':' -f2 | tr -d '\r')

        if [ "${state}" = "master" ]; then
                $CRM_MASTER -v $CRM_MASTER_SCORE # score is 10
                exit $OCF_RUNNING_MASTER
        fi

        if [ "${state}" = "slave" ]; then
                $CRM_MASTER -v $CRM_SLAVE_SCORE # score is 5
                exit $OCF_SUCCESS
        fi

        # neither slave nor master, so the resource has failed
        $CRM_MASTER -l reboot -D
        if [ $CURSCORE -eq $CRM_MASTER_SCORE ]; then
                exit $OCF_FAILED_MASTER
        fi

        exit $OCF_NOT_RUNNING
}
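
The state parsing in the function above (cut on ':' and strip the carriage return) can be tried without a cluster or a live server. A minimal sketch, assuming RET ultimately comes from the "role:" line of `redis-cli info` output; redis_role_from_info is a hypothetical helper name and the sample reply is canned, not taken from the poster's system:

```shell
# Hypothetical helper: extract the replication role from an INFO reply.
# Assumes the reply contains a line such as "role:master" ending in CR LF.
redis_role_from_info() {
    # $1: full text of the INFO reply; prints the role, or nothing if absent
    printf '%s\n' "$1" | tr -d '\r' | sed -n 's/^role://p' | head -n 1
}

# Canned input standing in for a real server reply (illustrative values only).
sample_info=$(printf 'redis_version:2.4.0\r\nrole:master\r\nconnected_slaves:1\r\n')
redis_role_from_info "$sample_info"
```

An empty result here corresponds to the "neither master nor slave" branch of the monitor function.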

From my logs I know that the monitor function returned OCF_FAILED_MASTER
when the master went down, and then this error occurred:
redis-server:0_monitor_5000 (node=s1, call=16, rc=9, status=complete):
master (failed)
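
For readers decoding the rc= field in the error above: the numbers are the standard OCF return codes, so rc=9 is OCF_FAILED_MASTER and rc=7 is OCF_NOT_RUNNING. A minimal sketch of a lookup; ocf_rc_name is a hypothetical helper, not part of the poster's agent or of ocf-shellfuncs:

```shell
# Hypothetical helper: map a numeric OCF return code to its symbolic name.
ocf_rc_name() {
    case "$1" in
        0) echo OCF_SUCCESS ;;
        1) echo OCF_ERR_GENERIC ;;
        2) echo OCF_ERR_ARGS ;;
        3) echo OCF_ERR_UNIMPLEMENTED ;;
        4) echo OCF_ERR_PERM ;;
        5) echo OCF_ERR_INSTALLED ;;
        6) echo OCF_ERR_CONFIGURED ;;
        7) echo OCF_NOT_RUNNING ;;
        8) echo OCF_RUNNING_MASTER ;;
        9) echo OCF_FAILED_MASTER ;;
        *) echo "unknown ($1)" ;;
    esac
}

# The code reported for redis-server:0_monitor_5000:
ocf_rc_name 9
```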

After that, the failed master is no longer monitored on that node until I
run a cleanup:
#crm resource cleanup redis-server:0


My questions:
1) What am I doing wrong, and how can I fix it?
I've tried on-fail="restart", but that did not help.

2) With an older version of Redis (2.3), Redis hangs for some time
(21-24 seconds) when the master fails. Even though I set a higher timeout
on the monitor operations, they still time out after 20 seconds. Why?
(Raising default-action-timeout resolved this, but I think the
per-operation timeout should be enough.)



--
Greg


_______________________________________________
Pacemaker mailing list: Pacemaker [at] oss
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org


andrew at beekhof

Apr 26, 2012, 6:43 PM

Post #2 of 8 (1152 views)
Permalink
Re: Why monitor fails in my RA [In reply to]

On Thu, Apr 26, 2012 at 6:41 AM, Greg <test123 [at] implix> wrote:
>
> Hi,
>
> I try to write redis resources agent working in master-slave. My
> configuration:
> node s1
> node s2
> primitive ip-redis ocf:heartbeat:IPaddr2 \
>        params ip="192.168.1.15" nic="eth0" cidr_netmask="24" \
>        op monitor interval="10s" timeout="30s" \
>        meta target-role="Started"
> primitive redis-server ocf:implix:redis4 \
>        op start interval="0" timeout="60s" \
>        op stop interval="0" timeout="60s" \
>        op monitor interval="5s" role="Master" timeout="60s" \
>        op monitor interval="10s" role="Slave" timeout="60s" \
>        params masterip="192.168.1.15"
> ms redis-ms redis-server \
>        meta master-max="1" master-node-max="1" clone-max="2" \
>        clone-node-max="1" target-role="Master"
> colocation co-redis-ms inf: ip-redis redis-ms:Master
> order or-redis inf: redis-ms:promote ip-redis:start
> property $id="cib-bootstrap-options" \
>        dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
>        cluster-infrastructure="openais" \
>        no-quorum-policy="ignore" \
>        stonith-enabled="false" \
>        expected-quorum-votes="2" \
>        default-action-timeout="20s" \
>        last-lrm-refresh="1335271825" \
>        default-resource-stickiness="10"
>
>
>        To simplify RA all redis nodes start as a slave (that's why I need to
> pass masterip in configuration).
>
>        Script works great it promote on secondary (if master node is down)
> but only few times. In some point sometimes after 2 or after 3 master fails
> (manually kill process) I get this error:
> redis-server:0_monitor_5000 (node=s1, call=16, rc=9, status=complete):
> master (failed)
>
> My mointor function (simplified and removed overhead and added some
> comments) is:
> redis_monitor() {
>        # I set score 10 for master 5 is for slave
>        CURSCORE=`$CRM_MASTER -G -q`
>        logger "redis_monitor: score $CURSCORE"
>        local state
>        redis_state
>
>        # In RET is current local redis state
>        state=$(echo "${RET}" | cut -d':' -f2 | tr -d '\r')
>
>        if [ "${state}" = "master" ];then
>                $CRM_MASTER -v $CRM_MASTER_SCORE # score is 10
>                exit $OCF_RUNNING_MASTER
>        fi
>
>        if [ "${state}" = "slave" ];then
>                $CRM_MASTER -v $CRM_SLAVE_SCORE # score is 5
>                exit $OCF_SUCCESS
>        fi
>
>        # if not slave/master so resource is failed
>        $CRM_MASTER -l reboot -D
>        if [ $CURSCORE -eq $CRM_MASTER_SCORE ];then
>                exit $OCF_FAILED_MASTER
>        fi
>
>        exit $OCF_NOT_RUNNING

Are you sure it's NOT_RUNNING?
Could it also be running but generally failed?

> }
>
> From my logs I know that monitoring function returned OCF_FAILED_MASTER when
> master is down and then this error occurred:
> redis-server:0_monitor_5000 (node=s1, call=16, rc=9, status=complete):
> master (failed)
>
> After that failed master node is not monitored on that node until I run
> cleanup:
> #crm resource cleanup redis-server:0
>
>
> My questions:
> 1) What I'm doing wrong ?. How can I fix this.
> I've tried on-fail="restart" but this not helped

You'd need to supply more information (in the form of a hb_report tarball).
An upgrade might not hurt either.

>
> 2) Using older version of redis 2.3 If master failed redis is hanging for
> some time (21-24 seconds). Even I set higher timeout on monitor functions it
> still timeout after 20 seconds why?.

How did you set the timeout higher?

> (Changing default-action-timeout to higher value helped to resolve this but
> I think timeout should be enough)
>
>
>
> --
> Greg
>
>



test123 at implix

Apr 27, 2012, 7:18 AM

Post #4 of 8 (1170 views)
Permalink
Re: Why monitor fails in my RA [In reply to]

On 04/27/12 03:43, Andrew Beekhof wrote:

[cut]
>> My mointor function (simplified and removed overhead and added some
>> comments) is:
>> redis_monitor() {
>> # I set score 10 for master 5 is for slave
>> CURSCORE=`$CRM_MASTER -G -q`
>> logger "redis_monitor: score $CURSCORE"
>> local state
>> redis_state
>>
>> # In RET is current local redis state
>> state=$(echo "${RET}" | cut -d':' -f2 | tr -d '\r')
>>
>> if [ "${state}" = "master" ];then
>> $CRM_MASTER -v $CRM_MASTER_SCORE # score is 10
>> exit $OCF_RUNNING_MASTER
>> fi
>>
>> if [ "${state}" = "slave" ];then
>> $CRM_MASTER -v $CRM_SLAVE_SCORE # score is 5
>> exit $OCF_SUCCESS
>> fi
>>
>> # if not slave/master so resource is failed
>> $CRM_MASTER -l reboot -D
>> if [ $CURSCORE -eq $CRM_MASTER_SCORE ];then
>> exit $OCF_FAILED_MASTER
>> fi
>>
>> exit $OCF_NOT_RUNNING
>
> Are you sure its NOT_RUNNING?
> Could it also be running but generally failed?
The redis_state function sets the RET variable to master for a master,
slave for a slave, and leaves it empty when Redis is not running or is
hanging. If Redis is neither master nor slave, it should be restarted.


>> From my logs I know that monitoring function returned OCF_FAILED_MASTER when
>> master is down and then this error occurred:
>> redis-server:0_monitor_5000 (node=s1, call=16, rc=9, status=complete):
>> master (failed)
>>
>> After that failed master node is not monitored on that node until I run
>> cleanup:
>> #crm resource cleanup redis-server:0
>>
>>
>> My questions:
>> 1) What I'm doing wrong ?. How can I fix this.
>> I've tried on-fail="restart" but this not helped
>
> You'd need to supply more information (in the form of a hb_report tarball).
> An upgrade might not hurt either.
There is no newer version for Debian in squeeze-backports :(


Report attached. I've changed the monitor a little and it now checks the
state using OCF_RESKEY_CRM_meta_role, but I still have the same problem. The
test scenario: the master runs on s1 and after a while I kill Redis on s1.
s2 becomes master, and after almost 2 minutes I do the same on s2 (kill the
Redis process). Redis on s1 becomes master. (All of that is in the report.)

After killing Redis on s2 the error occurs:
redis-server:0_monitor_5000 (node=s1, call=16, rc=9, status=complete):
master (failed)
Now if s2 becomes master, Redis on that node is never monitored again (if
s2 is a slave for Redis). It's very strange that this error never happens
when I kill Redis for the first time on s1.


redis_monitor() {
        local CURSTATE
        local state
        # One can use the (undocumented?) variable:
        #OCF_RESKEY_CRM_meta_role=Slave
        #OCF_RESKEY_CRM_meta_role=Master
        # Quote the tr arguments so the shell cannot expand them as globs.
        CURSTATE=$(echo "${OCF_RESKEY_CRM_meta_role}" | tr 'A-Z' 'a-z')

        logger "redis_monitor: current state: $CURSTATE"

        # check redis state
        redis_state
        state=$(echo "${RET}" | cut -d':' -f2 | tr -d '\r')
        logger "redis_monitor: redis state $state"

        # CRM says redis is master:
        if [ "${CURSTATE}" = "master" ]; then
                if [ "${state}" = "master" ]; then
                        logger "redis_monitor 1 $OCF_RUNNING_MASTER"
                        $CRM_MASTER -v $CRM_MASTER_SCORE
                        exit $OCF_RUNNING_MASTER
                else
                        logger "redis_monitor: CRM says master but redis says other thing"
                        $CRM_MASTER -D
                        exit $OCF_FAILED_MASTER
                fi
        fi

        # CRM says redis is slave:
        if [ "${CURSTATE}" = "slave" ]; then
                if [ "${state}" = "slave" ]; then
                        logger "redis_monitor 2 $OCF_SUCCESS"
                        # TODO: in the future, additional tests, e.g. write/read
                        # a key, check whether replication works, etc.
                        $CRM_MASTER -v $CRM_SLAVE_SCORE
                        exit $OCF_SUCCESS
                else
                        logger "redis_monitor: CRM says slave but redis says other thing"
                        $CRM_MASTER -D
                        exit $OCF_NOT_RUNNING
                fi
        fi

        # State not defined (not in master-slave state)
        if [ "${CURSTATE}" = "" ]; then
                if [ "${state}" = "" ]; then
                        logger "redis_monitor pre-end $OCF_NOT_RUNNING"
                        $CRM_MASTER -D
                        exit $OCF_NOT_RUNNING
                else
                        logger "redis_monitor pre-end $OCF_SUCCESS"
                        $CRM_MASTER -v $CRM_SLAVE_SCORE
                        exit $OCF_SUCCESS
                fi
        fi

        # It's impossible to get here but safe to keep it
        $CRM_MASTER -D
        logger "redis_monitor end $OCF_NOT_RUNNING"
        exit $OCF_NOT_RUNNING
}
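
The branching in the monitor above boils down to a mapping from (role the cluster expects, role Redis reports) to a return code. A minimal sketch of that mapping as a side-effect-free function, which mirrors the posted logic rather than correcting it; monitor_rc is a hypothetical name, and the numeric values are the standard OCF return codes that ocf-shellfuncs would normally provide:

```shell
# Standard OCF return codes (normally sourced from ocf-shellfuncs).
OCF_SUCCESS=0
OCF_NOT_RUNNING=7
OCF_RUNNING_MASTER=8
OCF_FAILED_MASTER=9

# Hypothetical helper: print the code the posted monitor would exit with.
# $1: role the cluster expects ("master", "slave", or empty)
# $2: role redis reports ("master", "slave", or empty if it does not answer)
monitor_rc() {
    case "$1:$2" in
        master:master) echo "$OCF_RUNNING_MASTER" ;;
        master:*)      echo "$OCF_FAILED_MASTER" ;;
        slave:slave)   echo "$OCF_SUCCESS" ;;
        slave:*)       echo "$OCF_NOT_RUNNING" ;;
        :)             echo "$OCF_NOT_RUNNING" ;;
        :*)            echo "$OCF_SUCCESS" ;;
        *)             echo "$OCF_NOT_RUNNING" ;;
    esac
}

# Cluster believes this node is master, but redis does not answer:
monitor_rc master ""
```

Keeping the decision separate from the $CRM_MASTER calls and the exit makes each combination easy to check by hand.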


>
>>
>> 2) Using older version of redis 2.3 If master failed redis is hanging for
>> some time (21-24 seconds). Even I set higher timeout on monitor functions it
>> still timeout after 20 seconds why?.
>
> How did you set the timeout higher?
>
By setting:
default-action-timeout="60s"

I think the monitor timeout should be sufficient, but the operation was
still stopped after 20 seconds. The log shows an entry like this:
Apr 26 16:25:37 SREVERXXX lrmd: [18777]: debug: on_msg_perform_op: add
an operation operation monitor[3] on ocf::redis::redis-serv:0 for client
18780, its parameters: vservers=[redis-2,redis-1]
CRM_meta_master_max=[1] CRM_meta_timeout=[20000] CRM_meta_clone_max=[2]
CRM_meta_master_node_max=[1] crm_feature_set=[3.0.1]
CRM_meta_globally_unique=[false] masterip=[X.X.X.X] CRM_meta_clone=[0]
CRM_meta_clone_node_max=[1] CRM_meta_notify=[false] to the operation list.
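
The log entry above shows CRM_meta_timeout=[20000], that is, this particular monitor ran with a 20 s timeout rather than the 60 s set on the recurring monitor operations. Pacemaker passes these CRM_meta_* values to the agent as OCF_RESKEY_CRM_meta_* environment variables (in milliseconds), so the agent can log what each invocation actually received. A minimal sketch; describe_op is a hypothetical helper, and which operation the log entry belongs to (for example a probe) is not established in this thread:

```shell
# Hypothetical helper: summarise the operation parameters handed to the agent.
# Values are in milliseconds; "unset" means the variable was not provided.
describe_op() {
    echo "op=${1:-unknown} interval_ms=${OCF_RESKEY_CRM_meta_interval:-unset} timeout_ms=${OCF_RESKEY_CRM_meta_timeout:-unset} role=${OCF_RESKEY_CRM_meta_role:-unset}"
}

# Set by hand here to match the log entry; in a real run the cluster sets it.
OCF_RESKEY_CRM_meta_timeout=20000
describe_op monitor
```

Inside an agent the result could be passed to logger at the start of each action, which shows at a glance whether a 20 s or a 60 s timeout applies to the call that failed.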



Another question: what value should the demote function return if the
node (master) is down? I return OCF_NOT_RUNNING and get this failure:
Failed actions:
redis-server:0_demote_0 (node=s1, call=61, rc=7, status=complete):
not running




--
Greg
Attachments: report2.tar.bz2 (63.8 KB)


dejanmm at fastmail

May 9, 2012, 4:57 AM

Post #5 of 8 (1113 views)
Permalink
Re: Why monitor fails in my RA [In reply to]

Hi,

On Wed, Apr 25, 2012 at 10:41:05PM +0200, Greg wrote:
>
> Hi,
>
> I try to write redis resources agent working in master-slave. My

Are you aware of a pull request for one redis resource agent:

https://github.com/ClusterLabs/resource-agents/pull/37

It's been there a while, blocked mainly because it uses debian
specific daemon start/stop machinery.

Thanks,

Dejan

> configuration:
> node s1
> node s2
> primitive ip-redis ocf:heartbeat:IPaddr2 \
> params ip="192.168.1.15" nic="eth0" cidr_netmask="24" \
> op monitor interval="10s" timeout="30s" \
> meta target-role="Started"
> primitive redis-server ocf:implix:redis4 \
> op start interval="0" timeout="60s" \
> op stop interval="0" timeout="60s" \
> op monitor interval="5s" role="Master" timeout="60s" \
> op monitor interval="10s" role="Slave" timeout="60s" \
> params masterip="192.168.1.15"
> ms redis-ms redis-server \
> meta master-max="1" master-node-max="1" clone-max="2" \
> clone-node-max="1" target-role="Master"
> colocation co-redis-ms inf: ip-redis redis-ms:Master
> order or-redis inf: redis-ms:promote ip-redis:start
> property $id="cib-bootstrap-options" \
> dc-version="1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c" \
> cluster-infrastructure="openais" \
> no-quorum-policy="ignore" \
> stonith-enabled="false" \
> expected-quorum-votes="2" \
> default-action-timeout="20s" \
> last-lrm-refresh="1335271825" \
> default-resource-stickiness="10"
>
>
> To simplify RA all redis nodes start as a slave (that's why I need
> to pass masterip in configuration).
>
> Script works great it promote on secondary (if master node is down)
> but only few times. In some point sometimes after 2 or after 3
> master fails (manually kill process) I get this error:
> redis-server:0_monitor_5000 (node=s1, call=16, rc=9,
> status=complete): master (failed)
>
> My mointor function (simplified and removed overhead and added some
> comments) is:
> redis_monitor() {
> # I set score 10 for master 5 is for slave
> CURSCORE=`$CRM_MASTER -G -q`
> logger "redis_monitor: score $CURSCORE"
> local state
> redis_state
>
> # In RET is current local redis state
> state=$(echo "${RET}" | cut -d':' -f2 | tr -d '\r')
>
> if [ "${state}" = "master" ];then
> $CRM_MASTER -v $CRM_MASTER_SCORE # score is 10
> exit $OCF_RUNNING_MASTER
> fi
>
> if [ "${state}" = "slave" ];then
> $CRM_MASTER -v $CRM_SLAVE_SCORE # score is 5
> exit $OCF_SUCCESS
> fi
>
> # if not slave/master so resource is failed
> $CRM_MASTER -l reboot -D
> if [ $CURSCORE -eq $CRM_MASTER_SCORE ];then
> exit $OCF_FAILED_MASTER
> fi
>
> exit $OCF_NOT_RUNNING
> }
>
> From my logs I know that monitoring function returned
> OCF_FAILED_MASTER when master is down and then this error occurred:
> redis-server:0_monitor_5000 (node=s1, call=16, rc=9,
> status=complete): master (failed)
>
> After that failed master node is not monitored on that node until I
> run cleanup:
> #crm resource cleanup redis-server:0
>
>
> My questions:
> 1) What I'm doing wrong ?. How can I fix this.
> I've tried on-fail="restart" but this not helped
>
> 2) Using older version of redis 2.3 If master failed redis is
> hanging for some time (21-24 seconds). Even I set higher timeout on
> monitor functions it still timeout after 20 seconds why?.
> (Changing default-action-timeout to higher value helped to resolve
> this but I think timeout should be enough)
>
>
>
> --
> Greg
>
>


test123 at implix

May 14, 2012, 4:01 AM

Post #6 of 8 (1084 views)
Permalink
Re: Why monitor fails in my RA [In reply to]

On 05/09/12 13:57, Dejan Muhamedagic wrote:
> Hi,
>
> On Wed, Apr 25, 2012 at 10:41:05PM +0200, Greg wrote:
>>
>> Hi,
>>
>> I try to write redis resources agent working in master-slave. My
>
> Are you aware of a pull request for one redis resource agent:
>
> https://github.com/ClusterLabs/resource-agents/pull/37
>
> It's been there a while, blocked mainly because it uses debian
> specific daemon start/stop machinery.
>
> Thanks,
>
> Dejan
>
[cut]

Thanks for pointing me to this script. My environment is slightly
different, so I can't use it without modification. I've noticed how your
script behaves in this scenario:
- the master is promoted and then goes down (for example OOM, admin
  mistake, etc.)

Your script will reply: OCF_ERR_GENERIC

My script will reply: OCF_FAILED_MASTER
(according to the manual this is the best option, but it doesn't work the
way I expected: after recovery to slave and a later promotion to master,
the node is no longer monitored in that state)

Can you tell me why you chose to answer OCF_ERR_GENERIC?


Thx

--
Grego




dejanmm at fastmail

May 14, 2012, 8:04 AM

Post #8 of 8 (1084 views)
Permalink
Re: Why monitor fails in my RA [In reply to]

On Mon, May 14, 2012 at 01:01:48PM +0200, Greg wrote:
> On 05/09/12 13:57, Dejan Muhamedagic wrote:
>> Hi,
>>
>> On Wed, Apr 25, 2012 at 10:41:05PM +0200, Greg wrote:
>>>
>>> Hi,
>>>
>>> I try to write redis resources agent working in master-slave. My
>>
>> Are you aware of a pull request for one redis resource agent:
>>
>> https://github.com/ClusterLabs/resource-agents/pull/37
>>
>> It's been there a while, blocked mainly because it uses debian
>> specific daemon start/stop machinery.
>>
>> Thanks,
>>
>> Dejan
>>
> [cut]
>
> Thx for pointing me on this script. My environment is slightly different
> so I can't use this script without modification.

This premise doesn't sound right. An RA should be generic enough
to fit in any environment. Having said that, I'm not implying
that either of these RAs is wrong, but that possibly one or the
other is lacking in flexibility.

> I've notice that your
> script in this scenario :
> - master is promoted and goes down (for example OOM, admin mistake etc.)
>
> Your script will reply: OCF_ERR_GENERIC
>
> My script wil reply: OCF_FAILED_MASTER
> (from manual this is best option but it's not working the way I've
> expected because after recovery to slave, and in case of promotion to
> master node is not monitored in this state anymore)
>
> Can you tell me why have you chose answer OCF_ERR_GENERIC ?

Not my script. Somebody wants to contribute it to our collection
of resource agents. You can ask questions at github, I suppose
that the author is still around.

Thanks,

Dejan

> Thx
>
> --
> Grego
>
>