
andrew at beekhof
Jul 3, 2012, 11:12 PM
Post #15 of 20
Re: Call cib_query failed (-41): Remote node did not respond
On Wed, Jul 4, 2012 at 10:06 AM, Brian J. Murrell <brian [at] interlinx> wrote:
> On 12-07-03 04:26 PM, David Vossel wrote:
>>
>> This is not a definite. Perhaps you are experiencing this given the pacemaker version you are running
>
> Yes, that is absolutely possible and it certainly has been under
> consideration throughout this process. I did also recognize however,
> that I am running the latest stable (1.1.6) release and while I might be
> able to experiment with a development branch in the lab, I could
> not use it in production. So while it would be an interesting
> experiment, my primary goal had to be getting 1.1.6 to run stably.
>
>> and the torture test you are running with all those parallel commands,
>
> It is worth keeping in mind that all of those parallel commands are just
> as parallel with the 4 node cluster as they are with the 8 (4 nodes
> actively modifying the CIB + 4 completely idle nodes) and 16 node
> clusters -- both of which failed.
>
> Just because I reduced the number of nodes doesn't mean that I reduced
> the parallelism any.

Yes. You did.
You reduced the number of "check what state the resource is on every node" probes.

> The commands being run on each node are not
> serialized and are all launched in parallel on the 4 node cluster as
> much as they were with the 16 node cluster.
>
> So strictly speaking, it doesn't seem that parallelism in the CIB
> modifications are as much of a factor as simply the number of nodes in
> the cluster, even when some (i.e. in the 8 node test I did) of the nodes
> are entirely passive and not modifying the CIB at all.

Now I'm getting annoyed.
I keep explaining this is not true yet you keep repeating the above assertion.
Please go back and re-read my previous answers (both here and off-list). Properly.
I will be happy to clarify anything that is still unclear.

>
>> but I wouldn't go as far as to say pacemaker cannot scale to more than a handful of nodes.
>
> I'd totally welcome being shown the error of my ways.
>
>> I'm sure you know this, I just wanted to be explicit about this so there is no confusion caused by people who may use your example as a concrete metric.
>
> But of course. In my experiments, it was clear that the cib process
> could peak a single core on my 12 core Xeons with just 4 nodes in the
> cluster at times.
>
> Therefore it is also clear that some time down the road, assuming CPU is
> the limiting factor here, it's quite easy to see how a faster CPU core,
> or multithreading the cib would allow for better scaling, but my point
> was simply at the current time, and again, assuming (since I don't know
> for sure what the limiting factor really is) CPU is the limiting factor
> here, somewhere between 4-8 nodes is the limit with more or less default
> tunings.
>
>> From the deployments I've seen on the mailing list and bug reports, the most common clusters appear to be around the 2-6 node mark.
>
> Which seems consistent.
>
>> The messaging involved with keeping all the local resource operations in the CIB synced across that many nodes is pretty insane.
>
> Indeed, and I most certainly had considered that. What really threw a
> curve in that train of thought for me though was that even idle,
> non-CIB-modifying nodes (i.e. turning a working 4 node cluster into a
> non-working 8 node cluster by adding 4 nodes that do nothing with the
> CIB) can tip a working configuration over into non-working.
>
> I could most certainly see how the contention of 8 nodes all trying to
> jam stuff into the CIB might be taxing with all of the locking that
> needs to go on, etc, but for those 4 added idle nodes to add enough
> complexity to make a working 4 node cluster not work is puzzling.
> Puzzling enough (granted, to somebody who knows zilch about the
> messaging that goes on with CIB operations) to make it smell more like a
> bug than simple contention.
>
>> If you are set on using pacemaker,
>
> Well, I am not necessarily married to it. It did just seem like the
> tool with the critical mass behind it. As sketchy as it might seem to
> ask, (and I only am since you seem to be hinting that there might be a
> better tool for the job) is there a tool more suited to the job?
>
>> the best approach for scaling for your situation would probably be to try and figure out how to break nodes into smaller clusters that are easier to manage.
>
> Indeed, that is what I ended up doing. Now my 16 node cluster is 4 4
> node clusters. The problem with that though, is that when a node in a
> cluster fails, it has only 3 other nodes to spread its resources around
> onto, and if 2 should fail, 2 nodes are trying to service twice their
> normal load. The benefit of larger clusters is clear in giving
> pacemaker more nodes to evenly distribute resources to, impacting the
> load of the other nodes minimally when one or more nodes of the
> cluster do fail.
>
>> I have not heard of a single deployment as large as you are thinking of.
>
> Heh. Not atypical of me to push the envelope I'm afraid. :-/
>
> Cheers, and many thanks for your input. It is valuable to this discussion.
>
> b.
>
>
> _______________________________________________
> Pacemaker mailing list: Pacemaker [at] oss
> http://oss.clusterlabs.org/mailman/listinfo/pacemaker
>
> Project Home: http://www.clusterlabs.org
> Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
> Bugs: http://bugs.clusterlabs.org