hexacyanide

Reputation: 91619

PHP: Retrieving lines from the end of a large text file

I've searched for an answer for quite a while, and haven't found anything that works correctly.

I have log files, some reaching 100MB in size, around 140,000 lines of text. With PHP, I am trying to get the last 500 lines of the file.

How would I get the 500 lines? Most functions read the whole file into memory, which isn't feasible here. I would prefer to stay away from executing system commands.

Upvotes: 3

Views: 3496

Answers (3)

Paul

Reputation: 141839

I wrote this function which seems to work quite nicely to me. It returns an array of lines just like file. If you want it to return a string like file_get_contents, then just change the return statement to return implode('', array_reverse($lines));:

function file_get_tail($filename, $num_lines = 10){

    $file = fopen($filename, "r");

    fseek($file, -1, SEEK_END);

    for ($line = 0, $lines = array(); $line < $num_lines && false !== ($char = fgetc($file));) {
        if($char === "\n"){
            if(isset($lines[$line])){
                $lines[$line][] = $char;
                $lines[$line] = implode('', array_reverse($lines[$line]));
                $line++;
            }
        }else
            $lines[$line][] = $char;
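        // Step back over the byte just read; stop once the start of the file is reached
        if(fseek($file, -2, SEEK_CUR) === -1)
            break;
    }
    fclose($file);

    // Finish the partial line left over when the start of the file was reached
    if($line < $num_lines && isset($lines[$line]))
        $lines[$line] = implode('', array_reverse($lines[$line]));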

    return array_reverse($lines);
}

Example:

file_get_tail('filename.txt', 500);

Upvotes: 4

Matthew

Reputation: 48284

If you want to do it in PHP:

<?php
/**
  Read last N lines from file.

  @param $filename string  path to file. must support seeking
  @param $n        int     number of lines to get.

  @return array            up to $n lines of text
*/
function tail($filename, $n)
{
  $buffer_size = 1024;

  $fp = fopen($filename, 'r');
  if (!$fp) return array();

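  fseek($fp, 0, SEEK_END);
  $pos = ftell($fp);

  // an empty file has nothing to read; fread() rejects a length of 0
  if (!$pos) { fclose($fp); return array(); }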

  $input = '';
  $line_count = 0;

  while ($line_count < $n + 1)
  {
    // read the previous block of input
    $read_size = $pos >= $buffer_size ? $buffer_size : $pos;
    fseek($fp, $pos - $read_size, SEEK_SET);

    // prepend the current block, and count the new lines
    $input = fread($fp, $read_size).$input;
    $line_count = substr_count(ltrim($input), "\n");

    // if $pos is == 0 we are at start of file
    $pos -= $read_size;
    if (!$pos) break;
  }

  fclose($fp);

  // return the last $n lines found

  return array_slice(explode("\n", rtrim($input)), -$n);
}

var_dump(tail('/var/log/syslog', 50));

This is largely untested, but should be enough for you to get a fully working solution.

The buffer size is 1024, but can be changed to be smaller or larger. (You could even dynamically set it based on $n * estimate of line length.) This should be better than seeking character by character, although it does mean we need to do substr_count() to look for new lines.
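
The idea of sizing the buffer from $n could look roughly like the sketch below. It is not the code above: tail_estimated() and the $avg_line_length guess of 120 bytes are made-up names and values for illustration, and it assumes "\n" line endings.

```php
<?php
// Sketch only: tail_estimated() and $avg_line_length are hypothetical, not part of the answer above.
function tail_estimated($filename, $n, $avg_line_length = 120)
{
  if ($n <= 0) return array();

  $fp = fopen($filename, 'r');
  if (!$fp) return array();

  fseek($fp, 0, SEEK_END);
  $pos = ftell($fp);

  // size each block so that a single read will usually cover all $n lines
  $buffer_size = max(1024, $n * $avg_line_length);

  $input = '';

  // keep prepending blocks until the buffer holds more than $n newlines
  // or the start of the file has been reached
  while ($pos > 0 && substr_count($input, "\n") <= $n)
  {
    $read_size = min($buffer_size, $pos);
    $pos -= $read_size;
    fseek($fp, $pos, SEEK_SET);
    $input = fread($fp, $read_size).$input;
  }

  fclose($fp);

  if ($input === '') return array();

  // drop the trailing newline, split, and keep the last $n lines
  return array_slice(explode("\n", rtrim($input, "\n")), -$n);
}
```

If the guess is too small the loop simply reads another block, so a bad estimate costs extra reads rather than correctness.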

Upvotes: 4

Chris Trahey

Reputation: 18290

If you are on a *nix machine, you should be able to shell out to the tool 'tail'. It's been a while, but something like this:

$lastLines = `tail -n 500 filename.txt`;

Notice the backticks, which execute the string in a shell (Bash or similar) and return its output.
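
If the file name comes from a variable, something along these lines may be safer than interpolating it into the command directly. This is only a sketch: tail_via_shell() is a hypothetical helper name, and it assumes a Unix-like system with `tail` on the PATH and shell_exec() enabled.

```php
<?php
// Sketch only: tail_via_shell() is a hypothetical helper, not part of the answer above.
function tail_via_shell($filename, $n)
{
  // cast the count to int and quote the path so neither can inject shell syntax
  $command = 'tail -n ' . (int) $n . ' ' . escapeshellarg($filename);

  // shell_exec() returns null or false when the command fails or prints nothing
  $output = shell_exec($command);
  if (!is_string($output) || $output === '') return array();

  // drop the trailing newline and split into lines
  return explode("\n", rtrim($output, "\n"));
}
```

Example:

$lastLines = tail_via_shell('filename.txt', 500);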

Upvotes: 6
