
jshirley at gmail
Oct 11, 2008, 1:03 PM
Post #8 of 10
On Sat, Oct 11, 2008 at 12:31 PM, John Goulah <jgoulah[at]gmail.com> wrote:
> On Fri, Oct 10, 2008 at 6:17 PM, J. Shirley <jshirley[at]gmail.com> wrote:
>> On Fri, Oct 10, 2008 at 2:22 PM, Ash Berlin <ash_cpan[at]firemirror.com> wrote:
>>>
>>> On 10 Oct 2008, at 20:46, J. Shirley wrote:
>>>
>>>>
>>>> I really really really don't want to start a (very much so offtopic)
>>>> flamewar, but I would like to get a discussion going about this versus
>>>> TheSchwartz. It seems roughly similar (at least in function).
>>>
>>> TBH one of the reasons I avoided TheSchwartz was that I couldn't work out
>>> what was going on. I did feel kinda iffy about wheel re-invention here, but
>>> there was something about TheSchwartz when i looked at that didn't sit well
>>> with me. Can't remember what it was anymore.
>>>
>>>>
>>>> Here are the features that TheSchwartz has that I didn't see in
>>>> MooseX::JobQueue (and yes, please name it something other than
>>>> MooseX::JobQueue)
>>>>
>>>> The following are handled because of Data::ObjectDriver, but want to
>>>> list them as features anyway:
>>>> 1. Partitioning of jobs in the database
>>>> 2. Built-in replication handling
>>>
>>> Not really sure what these two things are? Shouldn't replication be done at
>>> a DB level? Partitioning - as having jobs live in two different tables/DBs?
>>> If so then App::JobQueue (lets call it that for lack of a better
>>> alternative) does that.
>>>
>>
>> Well, I mean horizontal partitioning. So, automatic partitioning
>> based on some algorithm (like "if job->id % 2 => use this cluster").
>>
>> I didn't realize it did that... couldn't find that bit.
>>
>> As far as replication goes, DBIC handles some replication schemes but
>> there isn't the same support that D::OD has. I'm not championing
>> D::OD at all here, I prefer DBIC for all things; however D::OD has a
>> lot of code to support multiplexing and caching that DBIC hasn't
>> culled yet.
>>
>> So, while replication happens at the database layer, the interactions
>> there require client side behaviors. Such as reading from slaves,
>> write to masters, etc. DBIC already has basic slave/master support
>> but without support for slave read-delay (which is unfortunately
>> application specific in most cases) App::JobQueue won't have that...
>>
>> Which means worse replication support than TheSchwartz.
>
>
> This is correct. When it was put in production on several servers
> under a replicated DBIC things went a bit haywire with the job locking
> I believe when slaves got delayed and we had to point all queries at
> the master. Otherwise it does scale beautifully to multiple machines.
> I wonder what the best solution is here.
>
> John
>

I've spent a great deal of time thinking about it in the past and the
best solution I ever came up with was wrapping it in transactions when
you do a write and need to read the up-to-date information (meaning
that in a transaction, the read source is always the write source,
period.)

It does restrict some flexibility in the application, but I believe
that it is worth it for a few reasons. Mostly, it keeps the
application structure sane (and also thins controllers naturally).
You can put an intermediate "caching" layer (or, rather, data access)
that gets updated in a single API, so you have better testability. It
ends up being slightly more code, which is slightly slower, but it
scales near-linearly that way.

In the context of a job queue, the slave needs to access the most
up-to-date information on the job status (to make sure that there
isn't competition) so there will always be a read on the master to
determine the job state. After that, to query any other information
you could query a slave and disregard any read-delay, since in theory
once the job is assigned to a worker, it shouldn't be written to
except by that worker (or the master that marks the worker as
stalled/dead).
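
To make the routing rule above concrete, here is a minimal sketch in
Python rather than Perl. It is not DBIC, D::OD or TheSchwartz code; the
class name ReadRouter, its methods, and the connection names are all
hypothetical, and plain strings stand in for real database handles. The
only thing it tries to show is "inside a transaction, the read source
is the write source; outside, reads may go to a slave".

```python
from contextlib import contextmanager


class ReadRouter:
    """Hypothetical router that decides which connection serves a read."""

    def __init__(self, master, slaves):
        self.master = master
        self.slaves = list(slaves)
        self._txn_depth = 0  # greater than 0 while inside a transaction
        self._next = 0       # round-robin cursor over the slaves

    @contextmanager
    def transaction(self):
        # Inside this block every read is pinned to the write source.
        self._txn_depth += 1
        try:
            yield self.master
        finally:
            self._txn_depth -= 1

    def read_source(self):
        # Pin to the master inside a transaction, or when no slave exists.
        if self._txn_depth > 0 or not self.slaves:
            return self.master
        source = self.slaves[self._next % len(self.slaves)]
        self._next += 1
        return source


if __name__ == "__main__":
    router = ReadRouter(master="master-db", slaves=["slave-1", "slave-2"])

    with router.transaction():
        # e.g. checking and claiming a job: must see up-to-date status
        in_txn = router.read_source()

    # e.g. reading job arguments after the job has been claimed
    outside = [router.read_source() for _ in range(3)]
    print(in_txn, outside)  # master-db ['slave-1', 'slave-2', 'slave-1']
```

The sketch keeps its state in plain attributes, so it is not safe to
share between threads, and it does nothing about slave read-delay
outside a transaction; that part stays application specific.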

One other problem I ran into with TheSchwartz is that job execution
would occasionally hang, so newly triggered jobs ended up stacking on
top of each other. So, sending a signal to notify the working child
that its execution time is up would be very nice. That way it can back
out/stop working and exit gracefully, rather than having two competing
workers on the same resource (just thinking of parallelization cases
for master/slave scaling).

-J

_______________________________________________
Catalyst-dev mailing list
Catalyst-dev[at]lists.scsys.co.uk
http://lists.scsys.co.uk/cgi-bin/mailman/listinfo/catalyst-dev
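
The "signal the working child when its time is up" idea from the
message above could look roughly like this. It is a Unix-only sketch in
Python, not anything TheSchwartz or App::JobQueue provides; the
function names, the exit codes and the 10 ms "unit of work" are
assumptions made up for the illustration.

```python
import os
import signal
import time

EXIT_DONE = 0     # job finished within its time budget
EXIT_STOPPED = 3  # job backed out after being told to stop (arbitrary code)

_stop_requested = False


def _on_term(signum, frame):
    # Only set a flag; the worker backs out at a safe point in its loop.
    global _stop_requested
    _stop_requested = True


def run_worker(units):
    """Child process: do `units` small steps unless asked to stop."""
    for _ in range(units):
        if _stop_requested:
            # Release locks / roll back here before leaving.
            os._exit(EXIT_STOPPED)
        time.sleep(0.01)  # stand-in for one unit of real work
    os._exit(EXIT_DONE)


def supervise(units, time_limit):
    """Parent: fork a worker, send SIGTERM if it exceeds time_limit seconds."""
    # Install the handler before forking so the child inherits it and
    # cannot be signalled before it is ready.
    previous = signal.signal(signal.SIGTERM, _on_term)
    pid = os.fork()
    if pid == 0:
        run_worker(units)  # never returns
    signal.signal(signal.SIGTERM,
                  previous if previous is not None else signal.SIG_DFL)

    deadline = time.monotonic() + time_limit
    while time.monotonic() < deadline:
        done, status = os.waitpid(pid, os.WNOHANG)
        if done:
            return os.WEXITSTATUS(status)
        time.sleep(0.01)

    os.kill(pid, signal.SIGTERM)    # "your execution time is up"
    _, status = os.waitpid(pid, 0)  # wait for the graceful exit
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(supervise(units=5, time_limit=2.0))       # short job
    print(supervise(units=10_000, time_limit=0.1))  # overlong job
```

A worker that is truly hung (blocked and never reaching the flag check)
would not react to this; a real supervisor would presumably follow up
with SIGKILL after a grace period, which the sketch leaves out.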