
vitalyb at telenet
Apr 23, 2012, 12:28 PM
Post #3 of 4
On 23.04.2012 20:33, Kevin A. McGrail wrote:
> If you are using an SQL backend for your AWL, Kris Deugau gave me some ideas a few years ago where I added a column to
> the AWL like such:
>
> alter table awl add column `lastupdate` timestamp NOT NULL default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP
>
> Then I run a cron job in AWL to clear out the entries that haven't been used in a while on a graduated scale:
>
> DELETE FROM awl WHERE lastupdate <= (now() - INTERVAL 15 day) and count < 5;
> DELETE FROM awl WHERE lastupdate <= (now() - INTERVAL 30 day) and count < 10;
> DELETE FROM awl WHERE lastupdate <= (now() - INTERVAL 60 day) and count < 20;
> DELETE FROM awl WHERE lastupdate <= (now() - INTERVAL 120 day);

No, mine is BerkeleyDB, and it looks like sa-awl can handle DB files only.
I have a feeling that SQL for this task would add a large overall overhead
with no clear benefit.

> However, the idea below might be the only option for DB-Based backend. Why don't you open a ticket, please? I think
> it's a straightforward change and has good benefits. Might need to be a switch to control its behavior for those who
> care about time more than memory, though.

I thought the dev list would be the appropriate place to send a patch; I haven't
found clear guidelines, sorry. Submitting a ticket now.

Just for reference, the "12% slower" in the benchmark I performed is a 26 seconds vs
29 seconds case, and yet it saves 1G of RAM. With a database two to three times
larger, the sa-awl process won't even fit in a 32-bit virtual address space.

> Regards,
> KAM
>
> On 4/22/2012 12:58 PM, Vitaly V. Bursov wrote:
>> Hello,
>>
>> The current version of sa-awl loads the full database key list into memory before showing any
>> stats or performing maintenance. I believe it's obvious that this behavior is
>> undesirable and makes large databases impossible to handle.
>>
>> The patch below improves sa-awl scaling and responsiveness by scanning the database
>> row by row instead of loading all keys into memory first.
>>
>> Tested cleaning a db with over 8 million rows.
>>
>> For a cached db with 850K rows, memory usage drops from 1G to 6M; execution time
>> is around 12% slower, though.
>>
>> I'm not a perl expert, please review.
>>
>> Thanks.
>>
>> --- sa-awl.orig 2012-04-22 18:38:55.000000000 +0300
>> +++ sa-awl 2012-04-22 18:59:10.527228442 +0300
>> @@ -82,11 +82,10 @@
>>      or die "Cannot open file $db: $!\n";
>>  }
>>
>> -my @k = grep(!/totscore$/,keys(%h));
>> -for my $key (@k)
>> +while (my ($key, $count) = each %h)
>>  {
>> +  next if $key =~ /totscore$/;
>>    my $totscore = $h{"$key|totscore"};
>> -  my $count = $h{$key};
>>    next unless defined($totscore);
>>
>>    if ($opt_clean) {
>
>
> --
> *Kevin A. McGrail*
> President
>
> Peregrine Computer Consultants Corporation
> 3927 Old Lee Highway, Suite 102-C
> Fairfax, VA 22030-2422
>
> http://www.pccc.com/
>
> 703-359-9700 x50 / 800-823-8402 (Toll-Free)
> 703-359-8451 (fax)
> KMcGrail [at] PCCC <mailto:kmcgrail [at] pccc>
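
For anyone who wants to try the row-by-row scan from the quoted patch in isolation, here is a minimal sketch. It is not the actual sa-awl code: `%h` is a plain in-memory hash with made-up entries (in sa-awl it would presumably be tied to the database file), and `scan_awl` is a hypothetical helper name used only for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the AWL hash: each address key holds a message
# count, and "<key>|totscore" holds the accumulated score for that key.
my %h = (
    'a@example.com|ip=10.1'          => 4,
    'a@example.com|ip=10.1|totscore' => 10,
    'b@example.com|ip=10.2'          => 2,
    'b@example.com|ip=10.2|totscore' => -3,
    'c@example.com|ip=10.3'          => 7,    # no totscore entry: skipped
);

# Walk the hash one key/value pair at a time instead of building a full
# key list first. Returns a hashref of key => average score.
sub scan_awl {
    my ($h) = @_;
    my %avg;
    keys %$h;    # reset the hash iterator before using each()
    while (my ($key, $count) = each %$h) {
        next if $key =~ /totscore$/;             # skip the score records
        my $totscore = $h->{"$key|totscore"};
        next unless defined $totscore;           # skip incomplete entries
        $avg{$key} = $totscore / $count;
    }
    return \%avg;
}

my $avg = scan_awl(\%h);
printf "%s %.2f\n", $_, $avg->{$_} for sort keys %$avg;
```

The difference from the original loop is only in how keys are obtained: `keys(%h)` builds the whole key list before the first iteration, while `each` hands back one pair per call, which is where the memory saving reported above would come from.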