
nakahira at intellilink
Nov 16, 2009, 1:15 AM
Hi Dejan,

Dejan Muhamedagic wrote:
> Hi Kazutomo-san,
>
> On Fri, Nov 13, 2009 at 07:14:18PM +0900, NAKAHIRA Kazutomo wrote:
>> Hi, Dejan and Raoul
>>
>> I hope you will forgive me for being so slow to answer.
>> # I have some other works and it takes time.
>
> That's fine.
>
> [...]
>>> When sourcing ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
>>> you should use '.' not '..'.
>> I wrote '.' as a source command in the original file.
>> But it translated to '..' automatically by the my mailer
>> or our company's mail server.(It is so strange)
>
> Most funny. Attachment shouldn't be messed with really.
>
>> Please revise '..' to '.' if it translated again.
>
> OK.
>
>>> In the stop procedure, you use the QUIT signal. That's going to
>>> produce a coredump of the process. Is that actually intended? Why
>>> not use KILL after TERM?
>> Using "kill -QUIT" is actually intended in this RA.
>> QUIT signal for JVM process does not stop target process, but
>
> What does JVM has to do with syslog?
>
>> QUIT signal for common linux process stop target process.
>
> From signal(7):
>
>        SIGQUIT       3       Core    Quit from keyboard
>
> i.e. there could be core dumps and I'm not sure if that's
> what you intend.
>

What I intend is this: if the syslog-ng process does not stop after
"kill -TERM", the RA sends "kill -QUIT" to produce a core dump and
stop the process.

>> The syslog-ng RA's stop sequence is below.
>> 1. Execute "kill -TERM" and wait KILL_TERM_TIMEOUT seconds
>>    until syslog-ng process stopped.
>> 2. If syslog-ng process does not stopped, then Execute
>>    "kill -QUIT" KILL_QUIT_TIMEOUT times at intervals of
>>    1 second until syslog-ng process stopped.
>> 3. If syslog-ng process still alive, then Execute
>>    "kill -KILL" at intervals of 1 second
>>    until syslog-ng process stopped.
>
> One KILL should be enough as that is actually not delivered to
> the process at all, but the process gets removed from the system,
> unless it's in the D state, i.e. waiting for some device. But
> that's not really important.
>

That's for sure. Retrying "kill -QUIT" is redundant, and so is
KILL_QUIT_TIMEOUT. I revised these parts, and the syslog-ng RA's
stop sequence is now as follows.

1. Execute "kill -TERM" and wait up to KILL_TERM_TIMEOUT seconds
   for the syslog-ng process to stop.
2. If the syslog-ng process has not stopped, execute
   "kill -QUIT" once to dump core and stop the process.
3. If the syslog-ng process is still alive, execute
   "kill -KILL" at intervals of 1 second
   until the syslog-ng process has stopped.

>>> On formatting: sometimes spaces are used and sometimes tabs for
>>> indentation. Can you please use either one or the other
>>> (preferably the latter).
>> I agree. I substituted all indentation spaces to tabs.
>>
>> A re-revised syslog-ng RA is attached.
>>
>> Best Regards,
>> NAKAHIRA Kazutomo
>>
>> Dejan Muhamedagic wrote:
>>> Hi,
>>>
>>> On Tue, Nov 10, 2009 at 01:02:06PM +0100, Raoul Bhatia [IPAX] wrote:
>>>> On 09/21/2009 01:59 PM, Dejan Muhamedagic wrote:
>>>>> Hi Kazutomo-san,
>>>>>
>>>>> On Fri, Sep 18, 2009 at 05:19:28PM +0900, NAKAHIRA Kazutomo wrote:
>>>>>> Hi, Dejan
>>>>>>
>>>>>> I'm sorry I didn't get back to you sooner as a JBoss RA.
>>>>>> I took over mori-san and takenaka-san's work.
>>>>>>
>>>>>> I revised a syslog-ng RA referring to your comments.
>>>>>> The modification and my comments is written in the attached RA.
>>>>> When sourcing ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
>>>>> you should use '.' not '..'.
>>>>>
>>>>> In the stop procedure, you use the QUIT signal. That's going to
>>>>> produce a coredump of the process. Is that actually intended? Why
>>>>> not use KILL after TERM?
>>>>>
>>>>> On formatting: sometimes spaces are used and sometimes tabs for
>>>>> indentation. Can you please use either one or the other
>>>>> (preferably the latter).
>>>> hi,
>>>>
>>>> what is the current status on this one?
>>> Apparently waiting for some response from Kazutomo-san.
>>>
>>> Thanks,
>>>
>>> Dejan
>>>
>>>> cheers,
>>>> raoul
>>>> --
>>>> ____________________________________________________________________
>>>> DI (FH) Raoul Bhatia M.Sc.          email.          r.bhatia [at] ipax
>>>> Technischer Leiter
>>>>
>>>> IPAX - Aloy Bhatia Hava OEG         web.          http://www.ipax.at
>>>> Barawitzkagasse 10/2/2/11           email.            office [at] ipax
>>>> 1190 Wien                           tel.               +43 1 3670030
>>>> FN 277995t HG Wien                  fax.            +43 1 3670030 15
>>>> ____________________________________________________________________
>>>> _______________________________________________________
>>>> Linux-HA-Dev: Linux-HA-Dev [at] lists
>>>> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
>>>> Home Page: http://linux-ha.org/
>>
>> --
>> ----------------------------------------
>> NAKAHIRA Kazutomo
>> NTT DATA INTELLILINK CORPORATION
>> Open Source Business Unit
>> Software Services Integration Business Division
>
>> #!/bin/bash
>> #
>> # Description:  Manages a syslog-ng instance, provided by NTT OSSC as an
>> #               OCF High-Availability resource under Heartbeat/LinuxHA control
>> #
>> # Copyright (c) 2009 NIPPON TELEGRAPH AND TELEPHONE CORPORATION
>> #
>> ##############################################################################
>> # OCF parameters:
>> #   OCF_RESKEY_syslog_ng_binary : Path to syslog-ng binary.
>> #                                 Default is "/sbin/syslog-ng"
>> #   OCF_RESKEY_configfile       : Configuration file
>> #   OCF_RESKEY_start_opts       : Startup options
>> #   OCF_RESKEY_kill_term_timeout: Number of seconds to await to confirm a
>> #                                 normal stop method
>> #   OCF_RESKEY_kill_quit_timeout: Number of times to try forcible
>> #                                 stop methods
>> #
>> #   Only OCF_RESKEY_configfile must be specified. Each of the rests
>> #   has its default value or refers OCF_RESKEY_configfile to make
>> #   its value when no explicit value is given.
>> #
>> # Further information for setup:
>> #   There are sample configurations at the end of this file.
>> #
>> ###############################################################################
>>
>> . ${OCF_ROOT}/resource.d/heartbeat/.ocf-shellfuncs
>>
>> usage()
>> {
>> 	cat <<-!
>> usage: $0 action
>>
>> action:
>> 	start       : start a new syslog-ng instance
>>
>> 	stop        : stop the running syslog-ng instance
>>
>> 	status      : return the status of syslog-ng, run or down
>>
>> 	monitor     : return TRUE if the syslog-ng appears to be working.
>>
>> 	meta-data   : show meta data message
>>
>> 	validate-all: validate the instance parameters
>> 	!
>> 	return $OCF_ERR_ARGS
>> }
>>
>> metadata_syslog_ng()
>> {
>> 	cat <<END
>> <?xml version="1.0"?>
>> <!DOCTYPE resource-agent SYSTEM "ra-api-1.dtd">
>> <resource-agent name="syslog_ng">
>> <version>1.0</version>
>>
>> <longdesc lang="en">
>> This script manages a syslog-ng instance as an HA resource.
>> </longdesc>
>> <shortdesc lang="en">Syslog-ng resource agent</shortdesc>
>>
>> <parameters>
>>
>> <parameter name="syslog_ng_binary" unique="0">
>> <longdesc lang="en">
>> This parameter specifies syslog-ng's executable file.
>> </longdesc>
>> <shortdesc>Executable file</shortdesc>
>> <content type="string" default=""/>
>> </parameter>
>>
>> <parameter name="configfile" unique="0" required="1">
>> <longdesc lang="en">
>> This parameter specifies a configuration file
>> for a syslog-ng instance managed by this RA.
>> </longdesc>
>> <shortdesc>Configuration file</shortdesc>
>> <content type="string" default=""/>
>> </parameter>
>>
>> <parameter name="start_opts" unique="0">
>> <longdesc lang="en">
>> This parameter specifies startup options for a
>> syslog-ng instance managed by this RA. When no value is given, no startup
>> options is used. Don't use option '-F'. It causes a stuck of a start action.
>> </longdesc>
>> <shortdesc>Start options</shortdesc>
>> <content type="string" default=""/>
>> </parameter>
>>
>> <parameter name="kill_term_timeout" unique="0">
>> <longdesc lang="en">
>> On a stop action, a normal stop method(pkill -TERM) is firstly used.
>> And then the confirmation of its completion is waited for
>> the specified seconds by this parameter.
>> The default value is 10.
>> </longdesc>
>> <shortdesc>Number of seconds to await to confirm a normal stop method</shortdesc>
>> <content type="integer" default="10"/>
>> </parameter>
>>
>> <parameter name="kill_quit_timeout" unique="0">
>> <longdesc lang="en">
>> On a stop action, if a normal stop method ends up with a failure,
>> more forcible methods are taken. These methods are repeated the
>> specified numbers by this parameter.
>> The default value is 10.
>> If every normal or forcible stop methods run into a failure,
>> the KILL signal is used as a final method to stop.
>> </longdesc>
>> <shortdesc>Number of times to try forcible stop methods</shortdesc>
>> <content type="integer" default="10"/>
>> </parameter>
>>
>> </parameters>
>>
>> <actions>
>> <action name="start" timeout="60s" />
>> <action name="stop" timeout="120s" />
>> <action name="status" timeout="60" />
>> <action name="monitor" depth="0" timeout="30s" interval="10s" start-delay="0" />
>
> Perhaps default interval to something like 60s? Don't know what's
> the case in other RAs. Should probably be reviewed.

The default monitor interval in most RAs is "10s", and the syslog-ng RA
follows this value. Of course, the value should be changed according to
the monitoring requirements of each system.

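The interval in the metadata is only an advisory default; the value that takes effect is the one set on the monitor operation in the cluster configuration. A hedged example using the crm shell follows. The resource ID, the RA name "ocf:heartbeat:syslog-ng" and the configfile path are assumptions for illustration, not something confirmed by the attached RA.

```shell
# Assumptions: resource ID "prmSyslogNg", RA name "ocf:heartbeat:syslog-ng"
# and the configfile path are hypothetical; adjust them to your installation.
crm configure primitive prmSyslogNg ocf:heartbeat:syslog-ng \
	params configfile="/etc/syslog-ng/syslog-ng.conf" \
	op monitor interval="60s" timeout="30s"
```
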
>> <action name="meta-data" timeout="5s" />
>> <action name="validate-all" timeout="5"/>
>> </actions>
>> </resource-agent>
>> END
>> 	return $OCF_SUCCESS
>> }
>>
>> monitor_syslog_ng()
>> {
>> 	set -- $(pgrep -f "$PROCESS_PATTERN" 2>/dev/null)
>> 	case $# in
>> 	0) ocf_log debug "No syslog-ng process for $CONFIGFILE"
>> 	   return $OCF_NOT_RUNNING;;
>> 	1) return $OCF_SUCCESS;;
>> 	esac
>> 	ocf_log err "multiple syslog-ng process for $CONFIGFILE"
>
> BTW, does syslog-ng fork to process requests? Perhaps it's not
> necessary to treat this as an error condition. Note that on start
> it almost certainly does fork (most daemons do), so under some
> unfavourable conditions this code may fail.

I agree. I revised this part as follows: if multiple syslog-ng
processes are found in monitor, the RA writes a warning-level log
message and returns OCF_SUCCESS.

> Cheers,
>
> Dejan

Best Regards,
NAKAHIRA Kazutomo

--
----------------------------------------
NAKAHIRA Kazutomo
NTT DATA INTELLILINK CORPORATION
Open Source Business Unit
Software Services Integration Business Division
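
For reference, the two revisions described in this mail (the stop sequence and the monitor handling) might look roughly like the sketch below. It is not the attached RA: is_alive, stop_with_escalation and monitor_result are hypothetical names, the real RA locates the process with pgrep rather than taking a PID argument, and the exit codes 0 and 7 are assumed from the OCF convention that .ocf-shellfuncs normally provides.

```shell
#!/bin/bash
# Sketch only, not the attached RA. Helper names are hypothetical and the
# numeric exit codes are assumed from the OCF convention
# (0 = success, 7 = not running).
OCF_SUCCESS=0
OCF_NOT_RUNNING=7

# Hypothetical helper: succeeds while the given PID still exists.
is_alive() {
	kill -0 "$1" 2>/dev/null
}

# Revised stop sequence: TERM, wait, a single QUIT, then KILL until gone.
#   $1 = PID to stop, $2 = seconds to wait after TERM
stop_with_escalation() {
	local pid=$1 term_timeout=$2 waited=0

	# 1. Normal stop, then wait up to term_timeout seconds.
	kill -TERM "$pid" 2>/dev/null
	while is_alive "$pid" && [ "$waited" -lt "$term_timeout" ]; do
		sleep 1
		waited=$((waited + 1))
	done
	is_alive "$pid" || return $OCF_SUCCESS

	# 2. One QUIT only; whether a core file is written depends on ulimit -c.
	kill -QUIT "$pid" 2>/dev/null
	sleep 1

	# 3. KILL at 1-second intervals until the process is gone.
	while is_alive "$pid"; do
		kill -KILL "$pid" 2>/dev/null
		sleep 1
	done
	return $OCF_SUCCESS
}

# Revised monitor decision, given the number of matching processes.
monitor_result() {
	case $1 in
	0) return $OCF_NOT_RUNNING ;;
	1) return $OCF_SUCCESS ;;
	*) echo "WARNING: multiple syslog-ng processes found" >&2
	   return $OCF_SUCCESS ;;
	esac
}
```

With these helpers, a stop action would call something like `stop_with_escalation "$pid" "$KILL_TERM_TIMEOUT"`, and monitor would pass the number of PIDs returned by pgrep to `monitor_result`.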