david at lang
Aug 6, 2012, 4:59 PM
Post #4 of 4
On Mon, 6 Aug 2012, Chastity Blackwell wrote:
> On Fri, 2012-08-03 at 16:29 -0400, david [at] lang wrote:
>> On Fri, 3 Aug 2012, Chastity Blackwell wrote:
>>> We're looking at doing some load balancing for our rsyslog
>>> infrastructure and as part of that, obviously, I'd like to use the
>>> tcprebindinterval directive; however, I can't seem to find usage
>>> examples or syntax on the wiki or elsewhere from a google search. Does
>>> anyone have just a quick snippet from a conf file they can show me?
>> It's simple
>> $tcprebindinterval number
>> where number is the number of messages sent before disconnecting and
>> reconnecting.
>> I would suggest setting this number relatively high; reconnecting once per
>> second or so is more than you normally need to load balance reasonably and
>> avoids wasting too much time reconnecting.
> Thanks -- I'm assuming this is a per queue setting, so you'd want to set
> it higher for, say, a queue handling access logs for a high-traffic
> webserver and lower for system logs in general?
Like all other parameters in rsyslog, it affects the queue that you are
currently configuring and any future queues, so if you have a whole bunch
of separate action queues you can set them differently. If you don't have
a lot of action queues, the parameter will apply across the board.
However, in practice, you really don't need to worry that much about it.
Rsyslog can receive messages _very_ quickly; the bottleneck is almost
always in processing/delivering the messages. As long as the main queue
on the receiving boxes can handle the burst size, you are in good shape.
And as far as lower volume traffic goes, if it's low volume, you shouldn't
need to worry much about load balancing it, right :-)
This isn't trying to do 'perfect' load balancing where all receivers get
exactly the same number of messages, it's allowing you to do 'statistical'
load balancing where they will all receive about the same number of
messages over time.
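To illustrate the 'statistical' part, here is a rough Python sketch (not
rsyslog code). It assumes that each reconnect lands on a randomly chosen
receiver, which is only one possible balancer policy; a round-robin balancer
would spread things even more evenly. The names simulate and spread are made
up for this example.

```python
import random

def simulate(total_messages, rebind_interval, receivers, seed=0):
    """Count messages per receiver when the sender reconnects every
    rebind_interval messages.

    Assumption: each new connection lands on a uniformly random receiver.
    A real load balancer may use a different policy.
    """
    rng = random.Random(seed)
    counts = [0] * receivers
    sent = 0
    while sent < total_messages:
        target = rng.randrange(receivers)  # receiver chosen on (re)connect
        batch = min(rebind_interval, total_messages - sent)
        counts[target] += batch            # whole batch goes to one receiver
        sent += batch
    return counts

def spread(counts):
    """Gap between busiest and least busy receiver, relative to the mean."""
    mean = sum(counts) / len(counts)
    return (max(counts) - min(counts)) / mean

if __name__ == "__main__":
    # Hypothetical volumes, only to show the trend over time.
    for total in (100_000, 10_000_000):
        counts = simulate(total, rebind_interval=1000, receivers=10)
        print(total, f"relative spread = {spread(counts):.3f}")
```

The relative spread shrinks as the total number of messages grows, which is
the "about the same over time" behaviour described above.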
I have rsyslog load balancing across a farm of 10 machines (splunk
servers), where I'm load balancing not because the receivers can't handle
the rate of inbound messages, but because I want to have about the same
number of messages on each system so that when I do searches across the
logs, each system has about the same amount of work to do.
I just have one parameter set on the senders, I don't try to do different
things for different types of logs, and over time (a day) the servers are
so close to having the same number of log messages that on 150G of
logs/day (15G per server) the difference in the size of the log files is
well under 1M (0.001% variation).
In hindsight, there really isn't much need to have this as a configurable
parameter; if this were a boolean switch that reconnected after every
thousand (or even 10K) messages, it would work for well over 99% of cases.
At one reconnect per 1K messages, the overhead of the reconnect is
minimal, and reconnecting every 1K or 10K messages is more than good
enough to spread the logs across the receiving boxes.
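For a feel of what those intervals mean, the reconnect frequency is just the
message rate divided by the interval. A small Python sketch of that
arithmetic follows; the 5000 messages/second rate is a made-up figure, not a
measurement from my setup, and reconnects_per_second is a name invented for
this example.

```python
def reconnects_per_second(messages_per_second, rebind_interval):
    """How often a sender reconnects at a given message rate."""
    if rebind_interval <= 0:
        raise ValueError("rebind_interval must be positive")
    return messages_per_second / rebind_interval

if __name__ == "__main__":
    rate = 5000  # hypothetical sender rate, messages per second
    for interval in (1_000, 10_000):
        per_sec = reconnects_per_second(rate, interval)
        print(f"interval={interval}: {per_sec} reconnects/second")
```

A low-volume sender at the same interval reconnects far less often, which is
why a single setting tends to work across the board.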
Where this would fall down is if there is a HUGE overhead in processing
each log file, AND the logs are arriving relatively slowly so that one box
would be unable to process the burst of messages, or the lag in processing
that many messages on one box would become unacceptably large.
We didn't know this when we created this feature, so we went with the
easier to implement and more flexible option of letting the user set the
number.