
andrew at beekhof
May 9, 2012, 9:15 PM
Post #4 of 4
Re: start/stop operations fail to happen in parallel on resources
On Fri, Apr 20, 2012 at 12:30 AM, David Vossel <dvossel [at] redhat> wrote:
> ----- Original Message -----
>> From: "Parshvi" <parshvi.17 [at] gmail>
>> To: pacemaker [at] clusterlabs
>> Sent: Thursday, April 19, 2012 6:22:01 AM
>> Subject: [Pacemaker] start/stop operations fail to happen in parallel on resources
>>
>> Observations:
>> max-children=30
>> total no. of resources=18
>>
>> 1) At a default value 4 of max-children, following logs were observed
>> that led to monitor ops timeout for some resources (a total of 18
>> rscs):
>> a. max_child_count (4) reached, postponing execution of operation
>> monitor
>> b. WARN: perform_ra_op: the operation operation monitor[18] on
>> ocf::IPaddr2::ClusterIP for client 3754, stayed in operation list for
>> 14100 ms (longer than 10000 ms)
>> c. SOLUTION: the max-children of lrmd was raised to 30.
>> d. ISSUES STILL OBSERVED: while 2-3 resources are stuck in start
>> operation, if a rsc is issued an explicit start command
>> `crm resource start rcs1`, then the start op on this rsc is delayed
>> until any one of the previous resources exit from their start
>> operation.
>>
>
> This is what I would expect to happen. If a operation is in flight at
> the same time you make a configuration change, I don't believe the
> change will be looked at until the operation returns or times out.

Correct. We wait for any in-flight operations to complete but do not
initiate any more.

You can also set batch-limit to prevent pacemaker from sending "too
many" operations to the lrmd in the first place, but setting
max-children to 30 on a decent machine doesn't seem unreasonable.

_______________________________________________
Pacemaker mailing list: Pacemaker [at] oss
http://oss.clusterlabs.org/mailman/listinfo/pacemaker

Project Home: http://www.clusterlabs.org
Getting started: http://www.clusterlabs.org/doc/Cluster_from_Scratch.pdf
Bugs: http://bugs.clusterlabs.org
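
For readers who want a feel for why a low max-children leads to the "stayed in operation list" warning quoted in this thread, here is a rough toy model in Python. It is not lrmd's actual scheduler. It assumes every operation is queued at the same moment, runs in FIFO order, and takes a fixed 4000 ms, which is an invented figure. Only the 18 resources, the limits 4 and 30, and the 10000 ms threshold come from the thread. The function name simulate_queue_waits is hypothetical.

```python
import heapq


def simulate_queue_waits(durations_ms, max_children):
    """Return how long each operation waits in the queue before starting (ms).

    Toy model, NOT lrmd's real logic: all operations are submitted at t=0
    and run in FIFO order on at most max_children concurrent slots.
    """
    running = []  # min-heap of finish times for the busy slots
    waits = []
    for duration in durations_ms:
        if len(running) < max_children:
            start = 0  # a slot is free immediately
        else:
            start = heapq.heappop(running)  # wait for the earliest slot to free up
        waits.append(start)
        heapq.heappush(running, start + duration)
    return waits


if __name__ == "__main__":
    durations = [4000] * 18   # assumed: 18 ops of 4000 ms each
    threshold_ms = 10000      # threshold from the quoted WARN line
    for limit in (4, 30):
        waits = simulate_queue_waits(durations, limit)
        late = sum(1 for w in waits if w > threshold_ms)
        print(f"max_children={limit}: longest wait {max(waits)} ms, "
              f"{late} ops waited over {threshold_ms} ms")
```

Under these assumptions, a limit of 4 leaves the last operations queued for 16000 ms and six of them past the 10000 ms threshold, while a limit of 30 lets all 18 start at once. The model covers only queueing in front of the lrmd. It says nothing about the separate point above, that an explicit start issued mid-transition waits for in-flight operations to return.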