
jira at apache
Apr 24, 2012, 6:10 PM
[jira] [Commented] (SOLR-3393) Implement an optimized LFUCache
[ https://issues.apache.org/jira/browse/SOLR-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261197#comment-13261197 ]

Hoss Man commented on SOLR-3393:
--------------------------------

bq. I will attempt to make a new O(1) cache called FastLFUCache

{{#OhDearGodPleaseNotAnotherClassWithFastInTheName}}

Please, please, please let's end the madness of subjective adjectives in class names ... if it's an LFU cache wrapped around a "hawtdb" why don't we just call it "HawtDbLFUCache"?

bq. I've been working on this. I've come to realize that I don't completely understand how CacheRegenerator works. I suspect that it is geared around LRU caches and that the new cache won't have any of the frequency information from the old one, it will just put the entries into the cache as if they were new. Can anyone confirm this?

The idea behind the CacheRegenerator API is to be as simple as possible and agnostic to:

* the Cache Impl (ie: LRUCache vs LFUCache vs HawtDbLFUCache)
* the cache usage (ie: Query->DocSets vs Query->DocList vs String->MyCustomClass)
* the means of generating values from keys (ie: how do you know which MyCustomClass should be cached for which String)

... so you can have a custom (named) cache instance declared in your solrconfig.xml with your own MySpecialCacheRegenerator that knows about your use case and might do something special with the keys/values (like: short-circuit part of the generation if it can see the data hasn't changed, or read from authoritative data files outside of Solr, etc...) and then use *any* Cache impl class that your heart desires, and things will still work right.

bq. After the new cache is regenerated, should I go through the new cache, grab the frequency information from the old cache with each key, and fix the new cache up?

You certainly could -- when {{(new HawtDbLFUCache(...)).warm(...)}} is called, it needs to delegate to the regenerator for pulling values from the "old" cache, but that doesn't mean it can't also directly ask the "old" cache instance for stats about each of the old keys as it loops over them -- remember: the "new" cache is the one inspecting the "old" cache and deciding what things to ask the regenerator to generate.

But I question whether you really want any sort of stats from the "old" cache copied over to the "new" cache. It is, after all, a completely new cache -- with new usage. Should the stats really be preserved forever? Regardless of how popular an object was in the "old" cache instance, should we automatically assume it's equally popular in the "new" cache instance?

> Implement an optimized LFUCache
> -------------------------------
>
>                 Key: SOLR-3393
>                 URL: https://issues.apache.org/jira/browse/SOLR-3393
>             Project: Solr
>          Issue Type: Improvement
>          Components: search
>    Affects Versions: 3.6, 4.0
>            Reporter: Shawn Heisey
>            Priority: Minor
>             Fix For: 4.0
>
>         Attachments: SOLR-3393.patch, SOLR-3393.patch
>
>
> SOLR-2906 gave us an inefficient LFU cache modeled on FastLRUCache/ConcurrentLRUCache. It could use some serious improvement. The following project includes an Apache 2.0 licensed O(1) implementation. The second link is the paper (PDF warning) it was based on:
> https://github.com/chirino/hawtdb
> http://dhruvbird.com/lfu.pdf
> Using this project and paper, I will attempt to make a new O(1) cache called FastLFUCache that is modeled on LRUCache.java. This will (for now) leave the existing LFUCache/ConcurrentLFUCache implementation in place.
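
The warming flow described in the comment above (the "new" cache inspects the "old" one, delegates value generation to a regenerator, and may optionally read per-key stats from the old cache) can be illustrated with a rough sketch. Everything here is hypothetical and simplified: {{Regenerator}}, {{ToyLfuCache}}, and the {{carryOverHits}} flag are illustrative names, not Solr's actual {{CacheRegenerator}} or {{SolrCache}} interfaces, and eviction is a plain O(n) scan rather than the O(1) scheme from the linked paper.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a regenerator: it only knows how to rebuild a value for a key. */
interface Regenerator<K, V> {
    /** Returns the value to put in the new cache, or null to skip this key. */
    V regenerate(K oldKey, V oldValue);
}

/** Toy LFU cache for illustration only; not Solr's LFUCache. */
final class ToyLfuCache<K, V> {
    private static final class Entry<V> {
        final V value;
        long hits;
        Entry(V value, long hits) { this.value = value; this.hits = hits; }
    }

    private final int capacity;
    private final Map<K, Entry<V>> map = new HashMap<>();

    ToyLfuCache(int capacity) { this.capacity = capacity; }

    /** Returns the cached value and counts the access, or null on a miss. */
    V get(K key) {
        Entry<V> e = map.get(key);
        if (e == null) return null;
        e.hits++;
        return e.value;
    }

    void put(K key, V value) { putWithHits(key, value, 0); }

    /** Hit count for a key, or -1 if the key is not cached. */
    long hits(K key) {
        Entry<V> e = map.get(key);
        return e == null ? -1 : e.hits;
    }

    private void putWithHits(K key, V value, long hits) {
        if (!map.containsKey(key) && map.size() >= capacity) evictLeastFrequentlyUsed();
        map.put(key, new Entry<>(value, hits));
    }

    /** O(n) scan for the entry with the fewest hits; kept simple on purpose. */
    private void evictLeastFrequentlyUsed() {
        K victim = null;
        long min = Long.MAX_VALUE;
        for (Map.Entry<K, Entry<V>> e : map.entrySet()) {
            if (e.getValue().hits < min) { min = e.getValue().hits; victim = e.getKey(); }
        }
        map.remove(victim);
    }

    /**
     * The NEW cache drives warming: it walks the old cache's entries (most-used first),
     * asks the regenerator for each value, and decides for itself whether to copy the
     * old hit count. The regenerator never sees frequency information.
     */
    void warm(ToyLfuCache<K, V> old, Regenerator<K, V> regenerator,
              int maxItems, boolean carryOverHits) {
        List<Map.Entry<K, Entry<V>>> byHits = new ArrayList<>(old.map.entrySet());
        byHits.sort((a, b) -> Long.compare(b.getValue().hits, a.getValue().hits));
        int limit = Math.min(maxItems, capacity);
        int warmed = 0;
        for (Map.Entry<K, Entry<V>> e : byHits) {
            if (warmed >= limit) break;
            V fresh = regenerator.regenerate(e.getKey(), e.getValue().value);
            if (fresh == null) continue; // regenerator declined this key
            putWithHits(e.getKey(), fresh, carryOverHits ? e.getValue().hits : 0);
            warmed++;
        }
    }
}

final class WarmSketchDemo {
    public static void main(String[] args) {
        ToyLfuCache<String, Integer> old = new ToyLfuCache<>(3);
        old.put("a", 1);
        old.put("b", 2);
        old.get("a");
        ToyLfuCache<String, Integer> fresh = new ToyLfuCache<>(3);
        fresh.warm(old, (key, value) -> value, 2, false);
        System.out.println("hits for a after warm: " + fresh.hits("a"));
    }
}
```

With {{carryOverHits}} set to false, warmed entries start from zero hits, which matches the view that a new cache has new usage; setting it to true copies the old counts verbatim. Which behaviour is preferable is exactly the open question raised in the comment.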