
sergeyfd at gmail
Feb 26, 2007, 10:24 AM
Post #17 of 17
There were some more problems besides that initialization. Attached is a
patch. I tested it and it seems to work fine.

On 2/26/07, Andrew Beekhof <beekhof[at]gmail.com> wrote:
> On 2/26/07, Serge Dubrouski <sergeyfd[at]gmail.com> wrote:
> > You broke it:
> >
> > ./pgsql start
> > Usage: grep [OPTION]... PATTERN [FILE]...
> > Try `grep --help' for more information.
> > chown: missing operand after `:'
> > Try `chown --help' for more information.
> > 2007/02/26_12:50:26 ERROR: Can't start PostgreSQL.
> >
> > The reason for these errors is the changed way of initializing
>
> sorry - i've pushed up a fix
>
> > variables. Also I still don't like that indefinite loop on start
> > because it makes it harder to manually troubleshoot the problem if
> > PostgreSQL doesn't start.
>
> then add a call to ocf_log which indicates the RA is retrying or some-such
>
> the RA is definitely not the best place to set limits on how long a
> resource can take to start.
>
> at the very least it leads to confusion when the timeout is less than
> an RA's internal limit. on the other hand, if the internal limit is
> lower than the timeout, then you're returning before you needed to.
>
> it is also not reliable if any part of the RA can block.
>
> > I don't know what the right way to fix those problems is now: fix your
> > version of the script or fix the previous one.
> >
> > On 2/26/07, Andrew Beekhof <beekhof[at]gmail.com> wrote:
> > > i made some further improvements in:
> > > http://hg.beekhof.net/lha/crm-dev/rev/2e9b22cfb7e1
> > >
> > > On 2/26/07, Keisuke MORI <kskmori[at]intellilink.co.jp> wrote:
> > > > "Serge Dubrouski" <sergeyfd[at]gmail.com> writes:
> > > > >> "Serge Dubrouski" <sergeyfd[at]gmail.com> writes:
> > > > >>
> > > > >> > And I don't like the idea of removing the PID file in the "start"
> > > > >> > function. The standard approach is to remove it after stopping the
> > > > >> > application. Otherwise it could lead to an attempt to start a second
> > > > >> > copy of the application.
> > > > >>
> > > > >> This is necessary for the recovery from the power failure of the
> > > > >> primary node, for example. There is no chance to clean up by stop
> > > > >> in such cases.
> > > > >>
> > > > >> Duplicate starting is avoided by checking if the postmaster
> > > > >> process exists beforehand, as the original script does.
> > > > >
> > > > > Yes, but in this case you remove the legitimate pid file from the
> > > > > running instance. You remove it before checking for the
> > > > > postmaster.
> > > >
> > > > Well, I think that the script does the checking for postmaster first
> > > > and removing it second (remove it only when no postmaster process exists).
> > > >
> > > > Here's the code snip with my patch.
> > > > pgsql_status checks for it and I think it should be good enough.
> > > > ----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<----
> > > > pgsql_start() {
> > > >     if pgsql_status
> > > >     then
> > > >         ocf_log info "PostgreSQL is already running. PID=`cat $PIDFILE`"
> > > >         return $OCF_SUCCESS
> > > >     fi
> > > >
> > > >     if [ -x $PGCTL ]
> > > >     then
> > > >         # Remove postmaster.pid if it exists
> > > >         rm -f $PIDFILE
> > > > ----8<--------8<--------8<--------8<--------8<--------8<--------8<--------8<----
> > > >
> > > > > Let me think about it, I don't know what is worse in
> > > > > such a case. Probably you are right and we have the right to think that
> > > > > PostgreSQL shouldn't be started outside of cluster control.
> > > >
> > > > If postmaster was already started outside of heartbeat control,
> > > > then it should return OCF_SUCCESS and the postmaster should
> > > > continue to run.
> > > >
> > > > Power failure is one of the most typical situations that we want
> > > > to save with HA software, so this 'cleanup in start' is
> > > > important, I think.
> > > >
> > > > Maybe it would be nice if we put a WARN log before removing it.
> > > >
> > > > Thanks,
> > > >
> > > > >> > On 2/23/07, Serge Dubrouski <sergeyfd[at]gmail.com> wrote:
> > > > >> >> I like the idea of the patch, but honestly I don't like how it's
> > > > >> >> implemented. It should call (as Andrew suggested) the "monitor" function
> > > > >> >> to check whether pgsql is up or down instead of spreading the same code
> > > > >> >> all around the script. I'd like to review the idea and prepare another
> > > > >> >> patch if everybody agrees.
> > > > >>
> > > > >> Yes, using the same monitor function would be better.
> > > > >> I didn't do that just because it will dump many logs every
> > > > >> second when it takes time to start.
> > > > >> It is OK if you don't mind it.
> > > > >
> > > > > I don't think that this is a problem. Those files are big even without
> > > > > those records.
> > > > >
> > > > > Thanks for all these proposals.
> > > > >
> > > > >> Thanks,
> > > > >> --
> > > > >> Keisuke MORI
> > > > >> NTT DATA Intellilink Corporation
> > > > >> _______________________________________________________
> > > > >> Linux-HA-Dev: Linux-HA-Dev[at]lists.linux-ha.org
> > > > >> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> > > > >> Home Page: http://linux-ha.org/
> > > >
> > > > --
> > > > Keisuke MORI
> > > > Open Source Business Division
> > > > NTT DATA Intellilink Corporation
> > > > Tel: +81-3-3534-4811 / Fax: +81-3-3534-4814
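
The start-up flow discussed in this thread (return success if the postmaster is already running, log a warning before removing a stale pid file, then keep polling with an ocf_log message instead of an internal limit) can be sketched roughly as follows. This is an illustration only, not the attached patch or the committed RA: ocf_log, pgsql_status and start_postmaster are stand-ins defined here so the flow can be run outside a cluster, the default PIDFILE path is made up, and the stand-in "postmaster" is just a background sleep.

```shell
#!/bin/sh
# Sketch only. In a real RA, ocf_log and the OCF_* return codes come from the
# OCF shell function library; the definitions below are local stand-ins.
OCF_SUCCESS=0
OCF_ERR_GENERIC=1

# Hypothetical default path, used only if PIDFILE is not already set.
PIDFILE="${PIDFILE:-/tmp/postmaster.pid.sketch}"

ocf_log() {             # stand-in: first argument is the level, rest is the message
    level="$1"; shift
    echo "$level: $*" >&2
}

pgsql_status() {        # stand-in: "running" means the PID in $PIDFILE is alive
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

start_postmaster() {    # stand-in for the real pg_ctl call
    sleep 30 >/dev/null 2>&1 &
    echo $! > "$PIDFILE"
}

pgsql_start() {
    # Check for a running postmaster first, so a live pid file is never removed.
    if pgsql_status; then
        ocf_log info "PostgreSQL is already running. PID=$(cat "$PIDFILE")"
        return $OCF_SUCCESS
    fi

    # Only reached when no postmaster is running: a leftover pid file is stale
    # (for example after a power failure), so warn and remove it.
    if [ -f "$PIDFILE" ]; then
        ocf_log warn "Removing stale pid file $PIDFILE"
        rm -f "$PIDFILE"
    fi

    start_postmaster || return $OCF_ERR_GENERIC

    # No retry cap here: the sketch assumes the cluster's start timeout is
    # what ends the operation if the database never comes up.
    while ! pgsql_status; do
        ocf_log info "PostgreSQL is not up yet, retrying"
        sleep 1
    done
    return $OCF_SUCCESS
}
```

Calling pgsql_start a second time should take the "already running" branch and leave the pid file alone. Leaving the loop uncapped follows the point made above that the RA is not the best place to limit how long a start may take.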