Reputation: 311
I have a server log file from which I am trying to create a PHP page that summarises the data it stores. Each record in the log is stored on a new line, in the format:
207.3.35.52 -- [2007-04-01 01:24:42] "GET index.php HTTP/1.0" 200 11411 "Mozilla/4.0"
//ip -- [timestamp] "GET url HTTP/1.0" status code bytes "user agent".
I am trying to write a summary which displays: the total number of requests, the total number of requests from the articles directory, the total bandwidth consumed, and finally the number of 404 errors and the pages they occurred on.
PHP:
$handle = fopen('logfiles/april.log','r') or die ('File opening failed');
$requestsCount = 0;
while (!feof($handle)) {
$dd = fgets($handle);
$requestsCount++;
$parts = explode('"', $dd);
$statusCode = substr($parts[2], 0, 4);
}
fclose($handle);
This code opens the file, counts the number of records, and extracts the status code from each record. When I echo $statusCode inside the loop it shows the correct information: every status code in the log.
A function which accepts two arguments to total the 404 errors:
function requests404($l,$s) {
$r = substr_count($l,$s);
return "Total 404 errors: ".$r."<br />";
}
Echo the result:
echo requests404($statusCode, '404');
This function doesn't work; it just returns 0. Working with txt files in PHP is my weakest skill, and I would really appreciate some help as I think I may be going about it the completely wrong way. Thanks.
Upvotes: 3
Views: 19790
Reputation: 5931
Although I love using PHP for many things, parsing logs just isn't one of them.
Instead I'd really urge you to look into using awk for all your future log parsing endeavors.
Here is the simple bash/awk script I threw together, which implements all your requirements except listing the pages that returned 404, in a very easy to read and easy to understand manner:
#!/bin/bash
awk '
BEGIN {
total_requests = 0;
total_requests_articles = 0;
total_404s = 0;
total_bandwidth = 0;
} {
total_requests++;
if ( $8 == "404" ) {
total_404s++;
}
if ( $6 ~ /articles/ ) {
total_requests_articles++;
}
total_bandwidth += $9
} END {
printf "total requests: %i\n", total_requests
printf "total requests for articles: %i\n", total_requests_articles
printf "total 404s: %i\n", total_404s
printf "total bandwidth used: %i\n", total_bandwidth
}' "${1}"
Using this file as a demo:
207.3.35.52 -- [2007-04-01 01:24:42] "GET index.php HTTP/1.0" 200 11411 "Mozilla/4.0"
207.3.35.52 -- [2007-04-01 01:24:42] "GET index.php HTTP/1.0" 200 11411 "Mozilla/4.0"
207.3.35.52 -- [2007-04-01 01:24:42] "GET index.php HTTP/1.0" 200 11411 "Mozilla/4.0"
207.3.35.52 -- [2007-04-01 01:24:42] "GET articles/index.php HTTP/1.0" 404 11411 "Mozilla/4.0"
207.3.35.52 -- [2007-04-01 01:24:42] "GET articles/index.php HTTP/1.0" 200 11411 "Mozilla/4.0"
207.3.35.52 -- [2007-04-01 01:24:42] "GET index.php HTTP/1.0" 404 11411 "Mozilla/4.0"
Here's what the results look like:
[root@hacklab5 tmp]# ./apache.bash apache.log
total requests: 6
total requests for articles: 2
total 404s: 2
total bandwidth used: 68466
Just to say.. Awk is awesome. And blazing fast. And tailored for parsing logs. Now, learn you some awk for great good ;)
Cheers --
Upvotes: 3
Reputation: 57388
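substr_count counts the number of occurrences of "404" within $statusCode, and $statusCode is, each time, only the four bytes " 200" (or " 304" or " 404") of a single line of the log.
So whenever that status code is not 404, you will get zero, which is correct. Because requests404 is called only once, after the loop has finished, it only ever sees the value left over from the last line read.
You need to call requests404 on each line of input (returning the count rather than a formatted string), and sum the total.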
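A minimal sketch of that per-line approach, assuming the log format shown in the question. The sample lines and the helper name countStatus are made up for illustration and are not part of the original code:

```php
<?php
// Hypothetical sample lines in the question's log format.
$lines = array(
    '207.3.35.52 -- [2007-04-01 01:24:42] "GET index.php HTTP/1.0" 200 11411 "Mozilla/4.0"',
    '207.3.35.52 -- [2007-04-01 01:24:43] "GET articles/a.php HTTP/1.0" 404 512 "Mozilla/4.0"',
    '207.3.35.52 -- [2007-04-01 01:24:44] "GET missing.php HTTP/1.0" 404 512 "Mozilla/4.0"',
);

// Hypothetical helper: returns 1 if this line's status code equals $code, else 0.
function countStatus($line, $code) {
    $parts = explode('"', $line);
    // $parts[2] looks like " 404 512 "; after trimming, the first token is the status code.
    $fields = explode(' ', trim($parts[2]));
    return ($fields[0] === $code) ? 1 : 0;
}

$total404 = 0;
foreach ($lines as $line) {
    // Accumulate inside the loop instead of checking once after it.
    $total404 += countStatus($line, '404');
}

echo "Total 404 errors: " . $total404 . "<br />\n";
```

With a real log file, the foreach would be replaced by the fgets loop from the question; the point is only that the check and the addition both happen once per line.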
Actually it would probably be better to use an array:
$totals = array(
200 => 0,
404 => 0,
304 => 0,
);
$requestsCount = 0;
$bytesSent = 0;
$totalBytes = 0;
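while (($dd = fgets($handle)) !== false) {
// Skip blank lines such as a trailing newline, which would otherwise be counted
if (trim($dd) === '') continue;
$requestsCount++;
$parts = explode('"', $dd);
// $parts[2] looks like " 200 11411 ", so trim it before splitting on spaces
list($statusCode, $bytes) = explode(" ", trim($parts[2]));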
if (!isset($totals[$statusCode]))
$totals[$statusCode] = 0;
$totals[$statusCode]++;
if (200 == $statusCode)
$bytesSent += $bytes;
$totalBytes += $bytes;
}
fclose($handle);
printf("We got $totals[404] 404 errors\n");
At the end of the loop, $totals will hold something like
{
200 => 12345,
404 => 1234,
401 => 22,
304 => 7890,
...
}
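The same array idea can probably be stretched to cover the rest of the question: requests to the articles directory, total bandwidth, and which pages returned 404. The following is only a sketch under assumptions: the sample lines are invented, the $summary layout is hypothetical, and "in the articles directory" is taken to mean that the requested URL starts with articles/.

```php
<?php
// Hypothetical sample lines in the question's log format.
$lines = array(
    '207.3.35.52 -- [2007-04-01 01:24:42] "GET index.php HTTP/1.0" 200 11411 "Mozilla/4.0"',
    '207.3.35.52 -- [2007-04-01 01:24:42] "GET articles/index.php HTTP/1.0" 404 11411 "Mozilla/4.0"',
    '207.3.35.52 -- [2007-04-01 01:24:42] "GET articles/index.php HTTP/1.0" 200 11411 "Mozilla/4.0"',
    '207.3.35.52 -- [2007-04-01 01:24:42] "GET index.php HTTP/1.0" 404 11411 "Mozilla/4.0"',
);

// Hypothetical summary structure; 'notFound' maps URL => number of 404 responses.
$summary = array('requests' => 0, 'articles' => 0, 'bytes' => 0, 'notFound' => array());

foreach ($lines as $line) {
    $parts = explode('"', $line);
    if (count($parts) < 3) {
        continue; // skip blank or malformed lines
    }

    // $parts[1] is the request, e.g. 'GET articles/index.php HTTP/1.0'.
    $request = explode(' ', $parts[1]);
    $url = isset($request[1]) ? $request[1] : '';

    // $parts[2] is ' status bytes ', e.g. ' 404 11411 '.
    $fields = explode(' ', trim($parts[2]));
    $status = $fields[0];
    $bytes = isset($fields[1]) ? (int) $fields[1] : 0;

    $summary['requests']++;
    $summary['bytes'] += $bytes;

    // Assumption: URLs in the articles directory begin with "articles/".
    if (strpos($url, 'articles/') === 0) {
        $summary['articles']++;
    }

    if ($status === '404') {
        if (!isset($summary['notFound'][$url])) {
            $summary['notFound'][$url] = 0;
        }
        $summary['notFound'][$url]++;
    }
}

print_r($summary);
```

Keying notFound by URL keeps both pieces of information the question asks for: array_sum($summary['notFound']) gives the number of 404 errors, and array_keys($summary['notFound']) gives the pages.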
Upvotes: 0
Reputation: 411
$handle = fopen('logfiles/april.log','r') or die ('File opening failed');
$requestsCount = 0;
$num404 = 0;
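while (($dd = fgets($handle)) !== false) {
// Skip blank lines such as a trailing newline, which would otherwise be counted
if (trim($dd) === '') continue;
$requestsCount++;
$parts = explode('"', $dd);
$statusCode = substr($parts[2], 0, 4);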
if (hasRequestType($statusCode, '404')) $num404++;
}
echo "Total 404 Requests: " . $num404 . "<br />";
fclose($handle);
function hasRequestType($l,$s) {
return substr_count($l,$s) > 0;
}
Upvotes: 5