Gossamer Forum

Good Programmers vs. Reading logs

I would like to write a routine for reading logs, but it gets stuck in what looks like an endless loop and takes almost all of the CPU. Can you suggest something?

-------
open(UIL,"$logs");
@in=<UIL>;

foreach $on (@in){
    split(/\|/,$on);
    push(@a, $_[0]);
    push(@b, $_[0]);
}
foreach $c (@a){
    $x=0;
    foreach $y (@z){
        if ($c eq $y){$k=$c;}
    }
    if ($c ne $k){
        foreach $d (@b){
            if ($d eq $c){
                push(@z,"$d");$x++;
            }
        }
    }
    if ($x >0){
        open(T,">>report/log.rep");print T"$x|$c|\n";
    }
}
open(T,"report/log.rep");
@T=<T>;
foreach $l (@T){
    @T1=split(/\|/,$l);
    open (TMP2,"file.html");
    while (<TMP2>) {$_=~s/\[date\]/$T1[0]/g;$_=~s/\[calls\]/$T1[1]/g;
        print $_;}
}

-------

This counts the number of calls per day ($_[0]), but when the log file is big (e.g. 2 MB) it cannot generate the report. It seems to be stuck in a loop :(

I read the log file, then push the dates into two arrays. Then I compare them: I count how many times each date is repeated. The results are appended to a report file (report/log.rep).

Sorry. This is the first time I have written something like this ... on big files!

Thanks in advance.

Re: Good Programmers vs. Reading logs
Why not just construct a hash, with the date as the key, and increment the value each time a line with that date is found?

$blah{20001021}++; # something like that

Also, you REALLY need to come up with better names for your variables. It is a bad, bad thing to use $a, $b, @c, @d, etc. ($a and $b in particular are the variables Perl's sort uses.)

--mark

Installation support is provided via ICQ at UIN# 53788453. I will only respond on that number.
Re: Good Programmers vs. Reading logs
I would strongly appreciate an example ... I'm still having the problem :(

Here is an example of the original log file (the real file is around 2Mb):

-----
10/15/2000|hour|ip
10/15/2000|hour|ip
10/15/2000|hour|ip
11/15/2000|hour|ip
11/15/2000|hour|ip
11/15/2000|hour|ip
11/15/2000|hour|ip
----

The report file should look like this:

----
10/15/2000|3
11/15/2000|4
----

Again, the routine works on small files. You should try it with a big file. It is amazing (it crashes servers) ... but I can't find the error. Sorry.


Re: Good Programmers vs. Reading logs
Something like:

Code:
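my %stats;    # date => number of log lines seen for that date
open (FILE, "< /path/to/file") or die "Can't open: $!";
while (<FILE>) {
    chomp;
    # Each line looks like: date|hour|ip
    my ($date, $hr, $ip) = split /\|/;
    $stats{$date}++;
}
close FILE;

# Note: each() returns the entries in no particular order.
while (my ($date, $count) = each %stats) {
    print "$date|$count\n";
}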
Hope that helps,

Alex

--
Gossamer Threads Inc.
Re: Good Programmers vs. Reading logs
Alex, that works ... great!! Thanks very much.
I have only one problem now. How can I sort by date, for example?

Again, thank you for your help.

Re: Good Programmers vs. Reading logs
Change that to:

Code:
my (%stats, @dates);
open (FILE, "< /path/to/file") or die "Can't open: $!";
while (<FILE>) {
    chomp;
    my ($date, $hr, $ip) = split /\|/;
    push (@dates, $date) unless ($stats{$date}++);
    $stats{$date}++;
}
close FILE;

foreach my $date (@dates) {
    print "$date|$stats{$date}\n";
}
Cheers,

Alex

Re: Good Programmers vs. Reading logs
It doesn't work.

date|2
date|2
etc.

always the same number.

thanks.
Re: Good Programmers vs. Reading logs
Code:
push (@dates, $date) unless ($stats{$date}++);
$stats{$date}++;
The second line ($stats{$date}++;) will cause the counter to increment a second time for each entry, no? (or am I just missing something?).
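To see the double increment concretely, here is a minimal sketch that runs both versions of the loop body over the sample dates from earlier in the thread. The sub names `count_buggy` and `count_fixed` are made up for this illustration and are not part of anyone's script.

```perl
use strict;
use warnings;

# Sample dates taken from the log excerpt earlier in the thread.
my @sample = ('10/15/2000') x 3;
push @sample, ('11/15/2000') x 4;

# Hypothetical helper: the loop body as posted, with two increments per line.
sub count_buggy {
    my (%stats, @dates);
    for my $date (@_) {
        push @dates, $date unless $stats{$date}++;   # first increment
        $stats{$date}++;                             # second increment
    }
    return \%stats;
}

# Hypothetical helper: test the value without changing it, then increment once.
sub count_fixed {
    my (%stats, @dates);
    for my $date (@_) {
        push @dates, $date unless $stats{$date};
        $stats{$date}++;
    }
    return \%stats;
}

my $buggy = count_buggy(@sample);
my $fixed = count_fixed(@sample);
print "buggy: $_|$buggy->{$_}  fixed: $_|$fixed->{$_}\n" for sort keys %$fixed;
```

On the sample data the buggy version reports 6 and 8 where the fixed one reports 3 and 4, so every count comes out doubled.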

I would probably just do it as: (untested)

Code:
my %stats;

open (FILE, '/path/to/file') or die $!;
while (<FILE>) {
    split /\|/;
    $stats{$_[0]}++;
}
close FILE;

print "$_|$stats{$_}\n" foreach sort {$a cmp $b} keys %stats;
Just to avoid making an array also. Doesn't really matter though (the beauty of Perl, TMTOWTDI). (The above is untested so may need a tweak to actually run... dunno if I typo'd or anything :))
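One caveat for anyone running this on a newer interpreter: the bare `split /\|/;` relies on split filling `@_` in void context, which according to the perlfunc documentation only happened prior to Perl 5.11. Below is a sketch of the same idea with an explicit assignment. It reads from an in-memory string instead of the real log; the `$log` variable and its contents are stand-ins for illustration.

```perl
use strict;
use warnings;

# Stand-in for the real log file; same layout as the sample in the thread.
my $log = ("10/15/2000|hour|ip\n" x 3) . ("11/15/2000|hour|ip\n" x 4);

my %stats;
open(my $fh, '<', \$log) or die "Can't open in-memory log: $!";
while (my $line = <$fh>) {
    chomp $line;
    # Keep only the first field; the limit of 2 stops splitting after it.
    my ($date) = split /\|/, $line, 2;
    $stats{$date}++;
}
close $fh;

print "$_|$stats{$_}\n" foreach sort { $a cmp $b } keys %stats;
```

To use it on a real file, the in-memory open would be swapped for a normal three-argument open on the log path.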

--mark

Re: Good Programmers vs. Reading logs
It is not exactly "2".

"2" can be "20" or "20000". It is just the last value of
$stats{$_[0]}

... and this is not correct.
Re: Good Programmers vs. Reading logs
Why isn't it correct? Testing locally (using __DATA__ instead of a file...same logic)

my %stats;

while (<DATA>) {
    split /\|/;
    $stats{$_[0]}++;
}

print "$_|$stats{$_}\n" foreach sort {$a cmp $b} keys %stats;

__DATA__
10/15/2000|hour|ip
10/15/2000|hour|ip
10/15/2000|hour|ip
11/15/2000|hour|ip
11/15/2000|hour|ip
11/15/2000|hour|ip
11/15/2000|hour|ip


Runs and gives the output:
10/15/2000|3
11/15/2000|4

Did I overlook something?

Re: Good Programmers vs. Reading logs
Oops, you are right. It should be:

Code:
my (%stats, @dates);
open (FILE, "< /path/to/file") or die "Can't open: $!";
while (<FILE>) {
    chomp;
    my ($date, $hr, $ip) = split /\|/;
    # Remember each date the first time it is seen, without touching the count.
    push (@dates, $date) unless ($stats{$date});
    $stats{$date}++;
}
close FILE;

# Print in the order the dates first appeared in the log.
foreach my $date (@dates) {
    print "$date|$stats{$date}\n";
}
The @dates array is there just for ordering: it keeps the dates in the order they first appear in the log. I don't think 'cmp' will sort the dates properly.

Cheers,

Alex

Re: Good Programmers vs. Reading logs
cmp should do OK with it, since it compares strings character by character (the spaceship operator <=> is its numeric counterpart), and that puts zero-padded digits in sequence... though you're right, it may not work as desired here.

K, just tested:

__DATA__
10/15/2000|hour|ip
10/15/2000|hour|ip
10/15/2000|hour|ip
11/15/2000|hour|ip
11/15/2000|hour|ip
11/15/2000|hour|ip
11/15/2000|hour|ip
01/21/2000|hour|ip
01/24/2000|hour|ip
02/18/2000|hour|ip

yielded:

01/21/2000|1
01/24/2000|1
02/18/2000|1
10/15/2000|3
11/15/2000|4

wohoo! :D

--mark

Re: Good Programmers vs. Reading logs
Yes, but try:

my @dates = ("02/02/2000", "02/03/2000", "01/01/2001");
print sort { $a cmp $b } @dates;

;)
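Running that snippet with a separator makes the problem visible. This is a minimal sketch; the `join` and the `@sorted` variable are only added here so that the three dates do not run together on output.

```perl
use strict;
use warnings;

my @dates = ("02/02/2000", "02/03/2000", "01/01/2001");

# cmp compares character by character, so the month decides the order
# before the year is ever looked at.
my @sorted = sort { $a cmp $b } @dates;
print join(", ", @sorted), "\n";
# prints: 01/01/2001, 02/02/2000, 02/03/2000
```

The 2001 date comes out first even though it is the latest of the three.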

Cheers,

Alex

Re: Good Programmers vs. Reading logs
Ahhh gotcha =)

Re: Good Programmers vs. Reading logs
To add my 2 cents (CDN for what it's worth), the sort algorithm (using $a cmp $b) would work if the date was expressed as YYYY/MM/DD.
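Following that idea, the existing MM/DD/YYYY keys can be rearranged into YYYYMMDD just for the comparison, so the log format itself does not have to change. A minimal sketch: the helper name `sort_key` and the counts in `%stats` are made up for this example, and it assumes every date is zero-padded as in the sample log.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "MM/DD/YYYY" into "YYYYMMDD" for comparison only.
sub sort_key {
    my ($month, $day, $year) = split m{/}, $_[0];
    return "$year$month$day";
}

# Made-up counts for illustration.
my %stats = (
    "02/02/2000" => 5,
    "02/03/2000" => 2,
    "01/01/2001" => 7,
);

# Compare the rearranged keys, but keep the original strings for output.
my @by_date = sort { sort_key($a) cmp sort_key($b) } keys %stats;
print "$_|$stats{$_}\n" foreach @by_date;
```

With these keys the 2000 dates come first and 01/01/2001 last. For a very large number of distinct dates it may be worth computing each key once up front instead of on every comparison.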


Dan


Re: Good Programmers vs. Reading logs
Hi!!

Thank you very much for your help.