
Mailing List Archive: Linux-HA: Dev

Re: DB2 agent patch 2/3: Guard against hanging db2stop

 

 



dejanmm at fastmail

Nov 4, 2010, 8:16 AM

Post #1 of 2 (236 views)
Re: DB2 agent patch 2/3: Guard against hanging db2stop

Hi,

On Thu, Nov 04, 2010 at 09:31:04AM +0100, Holger.Teutsch [at] web wrote:
> # HG changeset patch
> # User Holger Teutsch <holger.teutsch [at] web>
> # Date 1288857475 -3600
> # Node ID 2ff375ca321554cf146bcf5be197f73fcbe28975
> # Parent 554ebfef6e9513178ea04cc4093710b65311934a
> Guard against a hanging db2stop by spawning this into the background. Use db2_kill after grace period.
>
> diff -r 554ebfef6e95 -r 2ff375ca3215 heartbeat/db2
> --- a/heartbeat/db2 Thu Nov 04 08:53:37 2010 +0100
> +++ b/heartbeat/db2 Thu Nov 04 08:57:55 2010 +0100
> @@ -211,16 +211,11 @@ db2_start() {
> done
> }
>
> -#
> -# db2_stop: Stop the given db2 database instance
> -#
> -db2_stop() {
> - # We ignore the instance, the info we need is already in $vars
> +# helper function in a spawned invocation of this script
> +# so we can detect a hang of the db2stop command
> +db2_stop_bg() {
> rc=$OCF_SUCCESS
> - db2_status || {
> - ocf_log info "DB2 UDB instance $1 already stopped"
> - return $rc
> - }
> +
> if
> output=`runasdb2 $db2adm/db2stop force`
> then
> @@ -236,17 +231,89 @@ db2_stop() {
> rc=$OCF_ERR_GENERIC;;
> esac
> fi
> - logasdb2 $db2db2 terminate
> - if [ -x $db2bin/db2_kill ]; then
> - logasdb2 $db2bin/db2_kill
> - elif [ -x $db2bin/db2nkill ]; then
> - logasdb2 $db2bin/db2nkill $db2node
> +
> + return $rc
> +}
> +
> +#
> +# db2_stop: Stop the given db2 database instance
> +#
> +db2_stop() {
> + # We ignore the instance, the info we need is already in $vars
> +
> + rc=$OCF_SUCCESS
> +
> + db2_status || {
> + ocf_log info "DB2 UDB instance $1 already stopped"
> + return $rc
> + }
> +
> + if [ -n "$OCF_RESKEY_stop_timeout" ]
> + then
> + stop_timeout=$OCF_RESKEY_stop_timeout
> + elif [ -n "$OCF_RESKEY_CRM_meta_timeout" ]; then
> + stop_timeout=$OCF_RESKEY_CRM_meta_timeout
> + else
> + stop_timeout=20000
> fi
> +
> + # grace_timeout is 4/5 of stop_timeout (which is in ms), in seconds
> + grace_timeout=$((stop_timeout/1250))
> +
> + # start db2stop in background as this may hang
> + sh $0 db2_stop_bg &

This should be OK:

db2_stop_bg &
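
For readers following the thread: backgrounding a shell function with `&` runs it in a subshell, `$!` holds that subshell's PID, and `wait` hands back the function's return status, so the script does not need to re-invoke itself. Below is a minimal sketch of the supervise-with-grace-period pattern from the patch. The names `slow_stop` and `run_with_grace` are hypothetical stand-ins (for `db2_stop_bg` and the polling loop), not part of the agent.

```sh
#!/bin/sh
# Hypothetical stand-in for db2_stop_bg: sleep $1 seconds, return status $2.
slow_stop() {
    sleep "$1"
    return "$2"
}

# Run "$@" in the background and allow it up to $1 seconds to finish.
# Returns the command's own status, or 1 if it had to be killed.
run_with_grace() {
    grace=$1; shift
    "$@" &                      # a function works here; it runs in a subshell
    pid=$!
    i=0
    while [ "$i" -lt "$grace" ]; do
        kill -0 "$pid" 2>/dev/null || break   # stop polling once it is gone
        sleep 1
        i=$((i+1))
    done
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid" 2>/dev/null            # still alive: give up on it
        wait "$pid" 2>/dev/null               # reap it so no zombie is left
        return 1
    fi
    wait "$pid"                               # collect the real exit status
}
```

Note that `kill -9` reaches only the backgrounded subshell. A command it started can outlive it, so a separate forced cleanup step (db2nkill or db2_kill in the patch) is still needed afterwards.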

> + stop_bg_pid=$!
> +
> + # wait for grace_timeout
> + i=0
> + while [ $i -lt $grace_timeout ]
> + do
> + kill -0 $stop_bg_pid 2>/dev/null || break;
> + sleep 1
> + i=$((i+1))
> + done
> +
> + # collect exit status but don't hang
> + if kill -0 $stop_bg_pid 2>/dev/null
> + then
> + stoprc=1
> + kill -9 $stop_bg_pid 2>/dev/null
> + else
> + wait $stop_bg_pid
> + stoprc=$?
> + fi
> +
> + if [ $stoprc -ne 0 ]
> + then
> + ocf_log warn "db2stop of $instance failed, using db2nkill"
> +
> + # db2nkill kills *all* partitions on the node
> + if [ -x $db2bin/db2nkill ]; then
> + logasdb2 $db2bin/db2nkill $db2node
> + elif [ -x $db2bin/db2_kill ]; then
> + logasdb2 $db2bin/db2_kill
> + fi
> +
> + # let the processes die
> + sleep 2
> +
> + if db2_status
> + then
> + ocf_log info "DB2 UDB instance $1 can not be killed with db2nkill"
> + rc=$OCF_ERR_GENERIC
> + else
> + ocf_log info "DB2 UDB instance $1 is now dead"
> + fi

Perhaps safer to wait in a loop until the processes are gone:

sleep 1
while db2_status; do
ocf_log info "waiting for DB2 UDB instance $1 processes to exit"
sleep 1
done
ocf_log info "DB2 UDB instance $1 is now dead"
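
One caveat with an open-ended loop is that, if the processes never exit, it spins until the cluster's own stop timeout fires. A bounded variant could look like the sketch below. `wait_until_gone` and its arguments are hypothetical names, not part of the agent, and the status check is passed in as a command so the example runs on its own.

```sh
#!/bin/sh
# Poll a status command until it fails (nothing running) or tries run out.
# $1 = maximum number of polls, remaining arguments = status command.
# Returns 0 once the status command reports "not running", 1 on timeout.
wait_until_gone() {
    tries=$1; shift
    while "$@"; do
        tries=$((tries-1))
        [ "$tries" -gt 0 ] || return 1
        sleep 1
    done
    return 0
}
```

In the agent this might be called as `wait_until_gone 10 db2_status`, with the limit taken from whatever remains of the stop timeout. The value 10 is only an illustration.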

Cheers,

Dejan

> + fi
> + # db2jd has been deprecated since v8.x and doesn't exist
> + # anymore in v9.x
> pids=`our_db2_ps | grep db2jd | cut -d' ' -f1`
> for j in $pids
> do
> runasdb2 kill -9 $j
> done
> +
> return $rc
> }
>
> @@ -373,6 +440,9 @@ case "$1" in
> stop) db2_stop $instance
> exit $?;;
>
> + db2_stop_bg) db2_stop_bg $instance
> + exit $?;;
> +
> status) if
> db2_status $instance
> then
> ___________________________________________________________
> Linux-HA-Dev: Linux-HA-Dev [at] lists
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/


Holger.Teutsch at web

Nov 5, 2010, 2:05 AM

Post #2 of 2 (226 views)
Re: DB2 agent patch 2/3: Guard against hanging db2stop

-----Original Message-----
From: "Dejan Muhamedagic" <dejanmm [at] fastmail>
Sent: Nov 4, 2010 4:16:09 PM
To: "High-Availability Linux Development List" <linux-ha-dev [at] lists>
Subject: Re: [Linux-ha-dev] DB2 agent patch 2/3: Guard against hanging db2stop

>Hi,
>
>On Thu, Nov 04, 2010 at 09:31:04AM +0100, Holger.Teutsch [at] web wrote:
>> # HG changeset patch
>> # User Holger Teutsch <holger.teutsch [at] web>
>> # Date 1288857475 -3600
>> # Node ID 2ff375ca321554cf146bcf5be197f73fcbe28975
>> # Parent 554ebfef6e9513178ea04cc4093710b65311934a
>> Guard against a hanging db2stop by spawning this into the background. Use db2_kill after grace period.
>>
>> diff -r 554ebfef6e95 -r 2ff375ca3215 heartbeat/db2
>> --- a/heartbeat/db2 Thu Nov 04 08:53:37 2010 +0100
>> +++ b/heartbeat/db2 Thu Nov 04 08:57:55 2010 +0100
>> @@ -211,16 +211,11 @@ db2_start() {
...
>> + grace_timeout=$((stop_timeout/1250))
>> +
>> + # start db2stop in background as this may hang
>> + sh $0 db2_stop_bg &
>
>This should be OK:
>
> db2_stop_bg &

Done.
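
As a side note on the quoted `grace_timeout` line: the divisor 1250 folds two steps into one, taking 4/5 of the millisecond timeout and converting it to whole seconds (x * 4/5 / 1000 = x / 1250). Below is a small sketch of the timeout selection and conversion as the patch describes them. The function name `grace_seconds` is made up for illustration.

```sh
#!/bin/sh
# Choose the stop timeout (ms) in the order the patch uses: the explicit
# stop_timeout parameter, then the CRM meta timeout, then a 20000 ms default.
# Prints the grace period in whole seconds.
grace_seconds() {
    if [ -n "${OCF_RESKEY_stop_timeout:-}" ]; then
        stop_timeout=$OCF_RESKEY_stop_timeout
    elif [ -n "${OCF_RESKEY_CRM_meta_timeout:-}" ]; then
        stop_timeout=$OCF_RESKEY_CRM_meta_timeout
    else
        stop_timeout=20000
    fi
    # 4/5 of the timeout, ms -> s, using integer division
    echo $((stop_timeout / 1250))
}
```

With the 20000 ms default this gives 16 seconds of grace before the kill path is taken.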

>

...
>> + # let the processes die
>> + sleep 2
>> +
>> + if db2_status
>> + then
>> + ocf_log info "DB2 UDB instance $1 can not be killed with db2nkill"
>> + rc=$OCF_ERR_GENERIC
>> + else
>> + ocf_log info "DB2 UDB instance $1 is now dead"
>> + fi
>
>Perhaps safer to wait in a loop until the processes are gone:
>
> sleep 1
> while db2_status; do
> ocf_log info "waiting for DB2 UDB instance $1 processes to exit"
> sleep 1
> done
> ocf_log info "DB2 UDB instance $1 is now dead"
>

I put in something similar but more specific.

>Cheers,
>
>Dejan
>


Once we are through with this I will rebase the multipartition patch.
Regards
Holger
Attachments: db2.2 (3.76 KB)
