
radu0gheorghe at gmail
Apr 11, 2012, 12:33 AM
Post #12 of 17
2012/4/10 Vlad Grigorescu <vladg [at] illinois>:
> The thing to consider here is what happens when you have multiple rsyslog servers logging to ElasticSearch. Does there need to be some kind of concurrency, so that each of them have unique IDs for the messages? What happens if two messages have the same ID?

If two messages have the same ID, the one that gets inserted last overrides the previous one, and gets an incremented _version. Which basically means you lose data, because the old message isn't there anymore.

> These are questions I'm unsure of, but for now, I'm happy to use ElasticSearch's automatic ID generation features.

Well, if you rely on Elasticsearch to generate the IDs, I don't think there's a way for rsyslog to know which documents were successfully inserted and which not:

# curl -XPUT 'http://localhost:9200/test2/'
{"ok":true,"acknowledged":true}
# curl -XPUT 'http://localhost:9200/test2/type1/_mapping' -d '
{
    "type2" : {
        "properties" : {
            "field1" : {"type" : "long"}
        }
    }
}
'
{"ok":true,"acknowledged":true}
# cat requests
{ "index" : { "_index" : "test2", "_type" : "type1" } }
{ "field1" : 1 }
{ "index" : { "_index" : "test2", "_type" : "type1" } }
{ "field1" : "bla" }
{ "index" : { "_index" : "test2", "_type" : "type1" } }
{ "field1" : 3 }
# curl -s -XPOST localhost:9200/_bulk --data-binary @requests; echo
{"took":29,"items":[{"create":{"_index":"test2","_type":"type1","_id":"F5a5Rxt1RCSLXQ0N7wV4_w","_version":1,"ok":true}},{"create":{"_index":"test2","_type":"type1","_id":"vU07l91nQu-Nx9xLoextrA","error":"MapperParsingException[Failed to parse [field1]]; nested: NumberFormatException[For input string: \"bla\"]; "}},{"create":{"_index":"test2","_type":"type1","_id":"q2uJUEleRTmVv0jGoPxZkQ","_version":1,"ok":true}}]}

The only way to know which document was inserted and which not is by order. Which looks a bit risky in my book.
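To make the "by order" point concrete, here is a minimal Python sketch that pairs each document in the bulk body with the response item at the same position. It is only an illustration: the function name failed_documents is made up for this example, and it assumes that every action line is followed by exactly one source line (true for index/create actions, not for delete) and that the response items come back in request order, which is exactly the assumption called risky above.

```python
import json

# Bulk body and response copied from the session above.
REQUESTS = """\
{ "index" : { "_index" : "test2", "_type" : "type1" } }
{ "field1" : 1 }
{ "index" : { "_index" : "test2", "_type" : "type1" } }
{ "field1" : "bla" }
{ "index" : { "_index" : "test2", "_type" : "type1" } }
{ "field1" : 3 }
"""

RESPONSE = r"""
{"took":29,"items":[
{"create":{"_index":"test2","_type":"type1","_id":"F5a5Rxt1RCSLXQ0N7wV4_w","_version":1,"ok":true}},
{"create":{"_index":"test2","_type":"type1","_id":"vU07l91nQu-Nx9xLoextrA","error":"MapperParsingException[Failed to parse [field1]]; nested: NumberFormatException[For input string: \"bla\"]; "}},
{"create":{"_index":"test2","_type":"type1","_id":"q2uJUEleRTmVv0jGoPxZkQ","_version":1,"ok":true}}
]}
"""


def failed_documents(bulk_body, bulk_response):
    """Return (position, document, error) for every item that reports an error.

    Hypothetical helper: documents are matched to response items purely by
    position, because the generated IDs are unknown to the sender.
    """
    lines = [ln for ln in bulk_body.splitlines() if ln.strip()]
    # Assumption: lines alternate between an action line and a source line.
    docs = [json.loads(src) for src in lines[1::2]]
    items = json.loads(bulk_response)["items"]
    if len(docs) != len(items):
        raise ValueError("request and response have different item counts")

    failures = []
    for position, (doc, item) in enumerate(zip(docs, items)):
        # Each item is a one-key dict such as {"create": {...}}.
        (result,) = item.values()
        if "error" in result:
            failures.append((position, doc, result["error"]))
    return failures


for position, doc, error in failed_documents(REQUESTS, RESPONSE):
    print(position, doc, error)
```

If the lengths ever differ, the sketch refuses to guess rather than mis-attribute a failure, which is about the only safety net available when matching by position.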
>
> --Vlad
>
> On 04/10/2012 09:49 AM, Radu Gheorghe wrote:
>> 2012/4/10 <david [at] lang>:
>>> On Tue, 10 Apr 2012, Vlad Grigorescu wrote:
>>>
>>>> a) Messages that didn't get successfully inserted should probably be
>>>> queued and reattempted once or twice before being discarded. Unfortunately,
>>>> the new transactional interface won't be sufficient here - if messages 1, 2,
>>>> 4, and 5 are successfully inserted, but message 3 fails, as far as I know,
>>>> there's no way in the transactional interface to communicate that only
>>>> message 3 failed, instead of message 3-5.
>>>
>>>
>>> actually, what happens is that rsyslog sends a transaction and gets a single
>>> success or failure message.
>>>
>>> if success, all messages were inserted
>>>
>>> if failure, it tries again with half as many messages to see if that goes
>>> through. If it gets down to one message and that fails, then it considers it
>>> a failure (and either retries, or drops the failed message)
>>>
>>> so if elasticsearch doesn't have transactions (all or none succeed), then
>>> some messages will be inserted multiple times.
>>
>> Maybe a solution to this is to use IDs somehow to avoid entering
>> duplicates. Trying to add the same bulk (with the same IDs) will only
>> "update" existing documents, and increment the "_version" number.
>>
>> I'm not sure how this could actually be implemented, but it might be an option.
>>
>> BTW, I'm also interested in Elasticsearch :). But since I'm using it
>> for logs, I'm not so much affected by duplicates.
>> _______________________________________________
>> rsyslog mailing list
>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>> http://www.rsyslog.com/professional-services/
>
> --
> Vlad Grigorescu | IT Security Engineer
> Office of Privacy and Information Assurance
> University of Illinois at Urbana-Champaign
> 0x632E5272 | 217.244.1922
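The halving retry and the ID idea from the quoted thread can be sketched together in Python. This is a toy model under stated assumptions, not rsyslog's actual code: submit_with_halving is one reading of "tries again with half as many messages" (it splits a failed batch in two and retries each half), both sink classes are hypothetical stand-ins for a store without transactions, and the SHA-1 of the message text is just one possible client-side ID.

```python
import hashlib
from collections import Counter


def submit_with_halving(batch, send):
    """Send a batch; on failure retry each half. Return messages that fail alone."""
    if not batch or send(batch):
        return []
    if len(batch) == 1:
        return list(batch)  # down to one message and it still fails
    mid = len(batch) // 2
    return (submit_with_halving(batch[:mid], send)
            + submit_with_halving(batch[mid:], send))


class AutoIdSink:
    """Hypothetical non-transactional store: every accepted message is a new document."""

    def __init__(self):
        self.docs = []

    def send(self, batch):
        ok = True
        for msg in batch:
            if msg == "bad":
                ok = False  # one bad message fails the whole call...
            else:
                self.docs.append(msg)  # ...but the good ones are kept anyway
        return ok


class ClientIdSink:
    """Same store, keyed by a client-supplied ID, so a resend overwrites."""

    def __init__(self):
        self.docs = {}

    def send(self, batch):
        ok = True
        for msg in batch:
            if msg == "bad":
                ok = False
            else:
                # Assumption: the ID is derived from the message text alone.
                doc_id = hashlib.sha1(msg.encode("utf-8")).hexdigest()
                self.docs[doc_id] = msg
        return ok


batch = ["m1", "m2", "bad", "m4", "m5"]

auto = AutoIdSink()
print(submit_with_halving(batch, auto.send), Counter(auto.docs))

keyed = ClientIdSink()
print(submit_with_halving(batch, keyed.send), sorted(keyed.docs.values()))
```

With generated IDs every retry stores the good messages again, while with client-supplied IDs each resend lands on the same document. The catch is the one raised earlier in the thread: an ID built only from the message text makes two genuinely distinct but identical log lines collide, so a real ID would need something extra, such as the sending host and a sequence number.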