Mailing List Archive: RSyslog: users

Load balancing for rsyslog aggregators

radu0gheorghe at gmail

Jan 30, 2012, 7:10 AM

Post #1 of 10
Load balancing for rsyslog aggregators

Hello,

I'm currently using rsyslog in the following setup:
- a bunch of clients send Syslog messages to a central rsyslog daemon via TCP
- right now, the central rsyslog daemon pipes the logs to both a plain
text file and an external program. But I guess that's not so relevant

Soon, this central rsyslog will be overwhelmed by the amount of
logging from the clients, so I'm looking for a way to
deploy additional "central" rsyslog daemons.

I thought about using DNS round robin:
- configure clients to send logs to a single hostname
- once a new "central" rsyslog is added, add it to DNS
Disadvantages to this are specific to DNS round-robin:
- load is not balanced if "central" servers are not the same
- there is a lag due to DNS caching

And I guess another solution is to have a script run after
deployment, which would change the rsyslog.conf on all the clients. But
that doesn't seem to be a good idea, especially since there's no way
to actually balance load on the aggregators, only to make some clients
log to one, others to another, etc.

Do you have any thoughts on how to solve the problem?
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/


rgerhards at hq

Jan 30, 2012, 7:37 AM

Post #2 of 10
Re: Load balancing for rsyslog aggregators

How many messages does the central server process per second? Does the
problem persist if you do not pipe to the external program? If so, you could
load-balance just that part.

rainer



david at lang

Jan 30, 2012, 12:43 PM

Post #3 of 10
Re: Load balancing for rsyslog aggregators

On Mon, 30 Jan 2012, Radu Gheorghe wrote:

> Do you have any thoughts on how to solve the problem?

what I do is use the iptables CLUSTERIP feature on Linux to set up one IP
address that gets shared across the cluster of systems. heartbeat (with
the pacemaker cluster management layer) can keep track of the cluster and
make sure that there is always a box handling the traffic.

what this does is use a multicast MAC address to send the traffic to
multiple machines. The kernel then does a hash on (one or more of) source
IP, source port, destination IP, destination port. It then divides this
hash into buckets (I am machine 1 of 10) and if it falls into the bucket
for this machine, it then sends the packet on to the application,
otherwise the kernel drops the packet.
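
To make the bucket idea above concrete, here is a rough Python sketch. It is
not how the kernel implements CLUSTERIP: the CRC32 hash, the function names
and the sample addresses are assumptions chosen only for illustration. Every
node sees every packet, computes the same hash, and keeps the packet only if
the result is its own node number.

```python
import zlib


def owner_node(src_ip: str, src_port: int, total_nodes: int) -> int:
    """Return the 1-based node number whose bucket this flow falls into."""
    key = f"{src_ip}:{src_port}".encode()
    # CRC32 is only a stand-in here; the kernel uses its own hash function.
    return zlib.crc32(key) % total_nodes + 1


def accepts(local_node: int, src_ip: str, src_port: int, total_nodes: int) -> bool:
    """True if this node passes the packet up, False if it would drop it."""
    return owner_node(src_ip, src_port, total_nodes) == local_node


if __name__ == "__main__":
    total = 3
    # Hypothetical clients: 50 source addresses, many source ports.
    flows = [("10.0.0.%d" % (i % 50 + 1), 30000 + i) for i in range(3000)]
    counts = {node: 0 for node in range(1, total + 1)}
    for ip, port in flows:
        counts[owner_node(ip, port, total)] += 1
    print(counts)  # flows handled per node
```

Because the source port is part of the hash in this sketch, a client that
reconnects from a new port may land on a different node.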

This has the advantage of not needing any other systems, it can be done
entirely on the receiving cluster.


Another option with TCP traffic is to set up an LVS (Linux
Virtual Server) load balancer (or run it through any commercial load
balancer).


In any of these configurations, you will want to consider the
tcprebindinterval config option of rsyslog on the sending machines so that
they will periodically close and re-open their connection (so that the
source port changes), otherwise you can end up with the traffic being
unbalanced between your systems without any way to re-balance the load.
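
As an illustration of the rebind idea (this is not rsyslog code, just a sketch
of the behaviour in Python), a sender can count messages and reopen its TCP
connection after a fixed number of them, so that it gets a new source port.
The class name, the `rebind_interval` parameter and the injectable `connect`
function are all assumptions made for this example.

```python
import socket


class RebindingSender:
    """Send lines over TCP, reopening the connection every N messages."""

    def __init__(self, host, port, rebind_interval, connect=socket.create_connection):
        self.addr = (host, port)
        self.rebind_interval = rebind_interval
        self._connect = connect          # injectable so it can be faked in tests
        self._sock = None
        self._sent = 0                   # messages sent on the current connection
        self.connections_opened = 0

    def _reconnect(self):
        if self._sock is not None:
            self._sock.close()
        # A fresh connection normally gets a new ephemeral source port.
        self._sock = self._connect(self.addr)
        self._sent = 0
        self.connections_opened += 1

    def send(self, line: bytes):
        if self._sock is None or self._sent >= self.rebind_interval:
            self._reconnect()
        self._sock.sendall(line + b"\n")
        self._sent += 1

    def close(self):
        if self._sock is not None:
            self._sock.close()
            self._sock = None
```

With a hash-based balancer in front, each reconnect gives the flow another
chance to land on a different receiver, which is what lets the load even out
over time.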

David Lang


david at lang

Jan 30, 2012, 12:52 PM

Post #4 of 10
Re: Load balancing for rsyslog aggregators

On Mon, 30 Jan 2012, david [at] lang wrote:

>> Do you have any thoughts on how to solve the problem?
>
> what I do is use iptables CLUSTERIP feature on linux to setup one IP address
> that gets shared across the cluster of systems. heartbeat (with the pacemaker
> cluster management layer) can keep track of the cluster and make sure that
> there is always a box handling the traffic
>
> what this does is use a multicast MAC address to send the traffic to multiple
> machines. The kernel then does a hash on (one or more of) source IP, source
> port, destination IP, destination port. It then divides this hash into
> buckets (I am machine 1 of 10) and if it falls into the bucket for this
> machine, it then sends the packet on to the application, otherwise the kernel
> drops the packet.
>
> This has the advantage of not needing any other systems, it can be done
> entirely on the receiving cluster.

here is a page on how to configure pacemaker to do this.

www.clusterlabs.org/doc/en-US/Pacemaker/1.1/html/Clusters_from_Scratch/ch08s06.html

I primarily use this with UDP traffic, which gives me the added advantage
that I can have multiple clusters receiving the same traffic. I did
extensive testing a couple of years ago, and going across a cisco 3500
switch I was able to handle traffic up to ~380K logs/sec (~250 byte log
messages) with no message losses with UDP over several billion log
messages sent to a dozen destination machines.

David Lang


radu0gheorghe at gmail

Jan 31, 2012, 1:07 AM

Post #5 of 10
Re: Load balancing for rsyslog aggregators

Thanks a lot, David. These solutions seem much better than DNS round robin.

@Rainer: I'm having trouble setting up more exact performance
tests, but the system is supposed to scale to something like 50K
messages per second. And I just assumed that one server won't handle
the load, especially since these machines are slow.

But you have a good point, because the external program inserts these
logs in ElasticSearch. And I have ElasticSearch on the same server
right now, which I don't have to. Using a dedicated server for Rsyslog
only might not cut it in the long run, but it would probably work well
for a while. More than enough for me to set up a cluster :D

So, thanks again. I consider the issue "solved".

Best regards,
Radu


rgerhards at hq

Jan 31, 2012, 1:23 AM

Post #6 of 10
Re: Load balancing for rsyslog aggregators

> @Rainer: I'm having trouble with setting up more exact performance
> tests, but the system is supposed to scale to something like 50K
> messages per second. And I just assumed that one server won't handle
> the load, especially since these machines are slow.

I regularly do 100K msgs/second on a 2 year old travel notebook, even in a
VMware environment with just an "old" Intel dual-core notebook processor... ;)

Rainer


david at lang

Jan 31, 2012, 2:48 AM

Post #7 of 10
Re: Load balancing for rsyslog aggregators

On Tue, 31 Jan 2012, Radu Gheorghe wrote:

> Thanks a lot, David. These solutions seem much better than DNS round robin.
>
> @Rainer: I'm having trouble with setting up more exact performance
> tests, but the system is supposed to scale to something like 50K
> messages per second. And I just assumed that one server won't handle
> the load, especially since these machines are slow.
>
> But you have a good point, because the external program inserts these
> logs in ElasticSearch. And I have ElasticSearch on the same server
> right now, which I don't have to. Using a dedicated server for Rsyslog
> only might not cut it in the long run, but it would probably work well
> for a while. More than enough for me to set up a cluster :D

the best thing that you could do for your performance is to commission the
writing of an output module that would let rsyslog insert the messages into
ElasticSearch instead of doing it with an external program.

At first glance this may seem like a trivial change, but the killer
feature that you can take advantage of with an output module is the
ability to handle multiple log messages as a single transaction.

I'm not familiar with ElasticSearch, but it's common for databases to be
able to handle inserts of 100 or even 1000 records as a single transaction
at exactly the same transaction/sec rate as inserting a single record per
transaction (or at a very slight reduction in insert rate). I've seen good
database setups where 10,000 inserts as a single transaction was only 1/2
the transaction rate of one insert per transaction (a 5,000x speedup)
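
The arithmetic behind that last figure can be checked with a few lines of
Python. The function names and the baseline of 1,000 transactions/sec are
made up for the example; only the batch size of 10,000 and the "1/2 the
transaction rate" ratio come from the numbers above.

```python
def records_per_second(tx_per_second: float, records_per_tx: int) -> float:
    """Insert rate in records/sec for a given transaction rate and batch size."""
    return tx_per_second * records_per_tx


def batching_speedup(batch_size: int, tx_rate_ratio: float) -> float:
    """Speedup over one-record-per-transaction inserts.

    tx_rate_ratio is the batched transaction rate divided by the
    single-insert transaction rate (0.5 means half as many transactions/sec).
    """
    return batch_size * tx_rate_ratio


if __name__ == "__main__":
    baseline_tx = 1000.0  # hypothetical single-insert transactions/sec
    single = records_per_second(baseline_tx, 1)
    batched = records_per_second(baseline_tx * 0.5, 10_000)
    print(single, batched, batched / single)
    print(batching_speedup(10_000, 0.5))
```

Half the transaction rate with 10,000 records per transaction works out to
10,000 x 0.5 = 5,000 times as many records per second.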

Adiscon does this sort of work (contact Rainer directly if you want a
quote)

but at 50K logs/sec rsyslog is not likely to end up as the bottleneck.
You should set up a test environment and stress test things to see how high
you can push the message rate before you can't keep up. There are a number
of variables that can end up being the bottleneck and you want to find
these in testing, not in production :-)

The first thing is that you want to be running a very recent rsyslog (5.8.x
or 6.x), the speedups in rsyslog since 4.x (which is in RHEL5 I believe)
are very significant. 6.3.x introduces a DNS cache that can be a drastic
speedup if you need DNS lookups (if not, you can start rsyslog with -x to
disable them on earlier versions)

you also need to define 'slow hardware', one person's slow hardware is
another person's mid-range server :-)

David Lang


vladg at illinois

Jan 31, 2012, 6:36 AM

Post #8 of 10
Re: Load balancing for rsyslog aggregators

On 1/31/12 4:48 AM, david [at] lang wrote:
> the best thing that you could do for your performance is to commission the
> writing of an output module that would let rsyslog insert the messages into
> ElasticSearch instead of doing it with an external program.

There's already a user-submitted output module available for this:

<http://article.gmane.org/gmane.comp.sysutils.rsyslog/5461/>
<http://article.gmane.org/gmane.comp.sysutils.rsyslog/5462/>
<http://article.gmane.org/gmane.comp.sysutils.rsyslog/5463/>

--Vlad


rgerhards at hq

Jan 31, 2012, 7:00 AM

Post #9 of 10
Re: Load balancing for rsyslog aggregators

I recently merged it, so it is in git for all versions (for released ones,
see ChangeLog).

rainer



radu0gheorghe at gmail

Jan 31, 2012, 7:01 AM

Post #10 of 10
Re: Load balancing for rsyslog aggregators

2012/1/31 <david [at] lang>:
> the best thing that you could do for your performance is to commission the
> writing of an output module that would let rsyslog insert the messages into
> ElasticSearch instead of doing it with an external program.
>
> At first glance this may seem like a trivial change, but the killer feature
> that you can take advantage of with an output module is the ability to
> handle multiple log messages as a single transaction.
>
> I'm not familiar with ElasticSearch, but it's common for databases to be
> able to handle inserts of 100 or even 1000 records as a single transaction
> at exactly the same transaction/sec rate as inserting a single record per
> transaction (or at a very slight reduction in insert rate). I've seen good
> database setups where 10,000 inserts as a single transaction was only 1/2
> the transaction rate of one insert per transaction (a 5,000x speedup)
>

I know there is an ES plugin available in the development version but
I couldn't get it to work:
http://kb.monitorware.com/can-install-elasticsearch-output-module-t11309.html

My script does bulk inserts already (I'm inserting each second). So
there shouldn't be a significant performance gain by using an rsyslog
plugin. Although I would prefer using plugins anyway.
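
For reference, the kind of once-per-second batching described above could
look roughly like the following Python sketch. This is not the actual script:
the class name, the `max_size` cap and the injectable `flush` and `clock`
callables are assumptions for illustration, and the real insert into
ElasticSearch would happen inside whatever function is passed as `flush`.

```python
import time
from typing import Callable, List


class TimedBatcher:
    """Collect messages and hand them to `flush` as one batch per interval."""

    def __init__(self, flush: Callable[[List[str]], None], interval: float = 1.0,
                 max_size: int = 1000, clock: Callable[[], float] = time.monotonic):
        self._flush = flush
        self._interval = interval
        self._max_size = max_size        # assumed safety cap on batch size
        self._clock = clock              # injectable so tests can fake time
        self._buf: List[str] = []
        self._last_flush = clock()

    def add(self, message: str) -> None:
        self._buf.append(message)
        now = self._clock()
        # Flush when the batch is full or the interval has elapsed.
        if len(self._buf) >= self._max_size or now - self._last_flush >= self._interval:
            self.flush()

    def flush(self) -> None:
        if self._buf:
            batch, self._buf = self._buf, []
            self._flush(batch)           # one bulk insert for the whole batch
        self._last_flush = self._clock()
```

Note that this sketch only checks the clock when a message arrives, so a
quiet period would need an explicit `flush()` call (for example from a timer)
to push out the last few messages.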

> Adiscon does this sort of work (contact Rainer directly if you want a quote)
>
> but 50K logs/sec is not likely to end up with rsyslog as the bottleneck. You
> should set up a test environment and stress test things to see how high you
> can push the message rate before you can't keep up. There are a number of
> variables that can end up being the bottleneck and you want to find these in
> testing, not in production :-)

Yes, I will do some proper testing and consider solutions afterwards.
Sorry for not doing my homework properly in the first place :(

>
> The first thing is that you want to be running a very recent rsyslog (5.8.x
> or 6.x), the speedups in rsyslog since 4.x (which is in RHEL5 I believe) are
> very significant. 6.3.x introduces a DNS cache that can be a drastic speedup
> if you need DNS lookups (if not, you can start rsyslog with -x to disable
> them on earlier versions)
>
> you also need to define 'slow hardware', one person's slow hardware is
> another person's mid-range server :-)

I guess defining 'slow hardware' must come after proper testing... So
I won't go there for now :)