
flazzarini at gmail
Nov 25, 2010, 12:52 AM
Post #7 of 9
(879 views)
Permalink
|
Hi Florian, I will have a look into this today, and fix these things. Thx for all your comments. Some comments inline... On 25.11.2010 09:31, Florian Haas wrote: > On 2010-11-24 19:46, Frank Lazzarini wrote: >> Hi, >> >> sure no problem we can name it firebird. > OK, firebird it is. > >> Packaging it with the firebird >> packages I don't really find a good idea, because not every user that >> needs firebird 2.5 will use it in a cluster. > That's why there would typically be a subpackage RPM named > firebird-resource-agent or whatever, and no-one would be obliged to > install that. > Yes this would be an idea but I would really appreciate beeing in the cluster-agents package. >> Therefore I would >> appreciate it if it would be packages with the pacemaker package, if >> that package includes all the ocf RA's. > It's not pacemaker, it's the resource-agents (on Debian: cluster-agents) > package, but sure, we can ship it. > >> Thanks again for the support. > A few more issues I noticed: > > 1. you use sudo, any particular reason why you're not doing "su -c" like > all the other RAs do? Besides, you should add a check_binary test (for > either sudo or su) to your validate function. > > I had some problems using "su -c" but I think it was because the pidfile couldn't be written, the fbguard process isn't that clever on giving out some error messages it just exits. Gonna go and change that to su, or like you say check wether su or sudo exists. > 2. Even if starting the daemon fails, you still pretend it succeeded: > >> # start fbguard with the specified user >> sudo -u $OCF_RESKEY_user $OCF_RESKEY_binary -pidfile $OCF_RESKEY_pid -daemon -forever >> >> return $OCF_SUCCESS > You can either exit on error after sudo (with "|| exit > $OCF_ERR_GENERIC"), or spin on monitor after start, or both. > > > 3. This too looks fishy: > >> if [ -f $OCF_RESKEY_pid ] >> then >> kill `cat $OCF_RESKEY_pid` >> if [ $? = 0 ]; then >> return $OCF_SUCCESS >> fi > All that $? tells you is whether you were able to _send_ the signal. It > says nothing about whether the process did the right thing with the > signal, and it _certainly_ does not imply that the process shut down > completely. Returning successfully here is premature. Ditch that if > block. Because just below: > >> fi >> >> while firebird25_monitor; do >> ocf_log debug "Resource has not stopped yet, waiting" >> sleep 1 >> done > ... you're doing the right thing and ask your monitor function to see if > the daemon is in fact dead (and if it's not, you wait till it is). > > > 4. In monitor, you do this: > >> pid=`cat $OCF_RESKEY_pid` >> kill -s 0 $pid>/dev/null 2>&1 >> rc=$? > ... and then test based on $rc. Now, kill -s 0 is generally fine to test > whether a process with that pid is currently running, but the problem > with that approach is that it _will_ succeed on a process that is a > zombie, which really is a failed resource. Thus, it's often a good idea > to combine this with an additional test, preferably one where you can > ask the daemon, or a client, whether the application is really > operational. Does Firebird offer such a facility? > > I am not really sure about this, the only thing I can think of is something similar than the mysql resource agent does, which is doing a sql on some Database system tables, but Firebird doesn't really offer that possibility as those tables only exist in the databases it self, so if there is no database currently attached to the firebird server I wouldn't have these system tables. But I will have a look into this, and maybe figure out a way to check this. > 5. There's an easier way to do this: > >> if [ ! -x $OCF_RESKEY_binary ]; then >> ocf_log err "fbguard binary $OCF_RESKEY_binary does not exist or is not executable" >> exit $OCF_ERR_INSTALLED >> fi > You just replace those 4 lines with "check_binary $OCF_RESKEY_binary". > Alrighty didn't know of that function. Hmm this just gives me an idea, remember I use a function to check wether the pid directory is writable or not, wouldn't that be an idea to add to .ocf-binaries ? Like I say just an idea. > 6. Also, >> getent passwd $OCF_RESKEY_user>/dev/null 2>&1 >> if [ ! $? -eq 0 ]; then > can just be replaced with > if ! getent passwd $OCF_RESKEY_user>/dev/null 2>&1; then > > ...for additional brevity. > > As far as I'm concerned, once you submit a version with these issues > fixed, this RA is in good shape to be merged. Lars, Dejan, and the other > usual suspects, if you have any objections to pushing this, please let > me know. > I will get to it, this afternoon to correct these things :), once again thx for all your comments. > Cheers, > Florian > > > > _______________________________________________________ > Linux-HA-Dev: Linux-HA-Dev [at] lists > http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev > Home Page: http://linux-ha.org/ Frank
|