
Reputation: 11236

What is causing memory to continuously rise in Perl?

Problem

I have created a simple Perl script to read log files and process the data asynchronously.

The problem I am facing is that the script appears to use more and more memory the longer it runs, and this seems to scale with the amount of data it processes. I am unable to identify what exactly is using all this memory, and whether it is a leak or whether something is simply holding onto it.


Question

How can I modify the script below so that it no longer continuously consumes memory?


Code

#Multithreaded to read multiple log files at the same time.

use strict;
use warnings;

use threads;
use Thread::Queue;
use threads::shared;

my $logq = Thread::Queue->new();
my %Servers :shared;
my %servername :shared;

sub csvsplit {
        my $line = shift;
        my $sep = (shift or ',');

        return () unless $line;

        my @cells;
        my $re = qr/(?:^|$sep)(?:"([^"]*)"|([^$sep]*))/;

        while($line =~ /$re/g) {
                my $value = defined $1 ? $1 : $2;
                push @cells, (defined $value ? $value : '');
        }

        return @cells;
}


sub process_data
{
        while(sleep(1)){

                if ($logq->pending())
                {
                        my %sites;
                        my %returns;
                        while($logq->pending() > 0){
                                my $data = $logq->dequeue();
                                my @fields = csvsplit($data);
                                $returns{$fields[$#fields - 1]}++;
                                $sites{$fields[$#fields]}++;
                        }
                        print "counter:$_, value=\"$sites{$_}\" />\n" for (keys %sites);
                        print "counter:$_, value=\"$returns{$_}\" />\n" for (keys %returns);
                }
        }

}

sub read_file
{
        my $myFile=$_[0];
        open(my $logfile,'<',$myFile) || die "error";
        my $Inode=(stat($logfile))[1];
        my $fileSize=(stat($logfile))[7];
        seek $logfile, 0, 2;
        for (;;) {
                while (<$logfile>) {
                        chomp( $_ );
                        $logq->enqueue( $_ );
                }
                sleep 5;
                if($Inode != (stat($myFile))[1] || (stat($myFile))[7] < $fileSize){
                        close($logfile);
                        while (! -e $myFile){
                                sleep 2;
                        }
                        open($logfile,'<',$myFile) || die "error";
                        $Inode=(stat($logfile))[1];
                        $fileSize=(stat($logfile))[7];
                }
                seek $logfile, 0, 1;
        }

}


my $thr1 = threads->create(\&read_file,"log");
my $thr4 = threads->create(\&process_data);
$thr1->join();
$thr4->join();

Observations and relevant info

The memory only seems to increase when the program has data to process; if I just leave it idle, it maintains its current memory usage.

Memory only appears to increase at higher throughput, growing by about half a MB every 5 seconds when processing around 2,000 lines in that time.

I have not included the CSV data as I do not think it is relevant. If you disagree and want me to add it, please give a valid reason.


Specs

GNU bash, version 3.2.57(1)-release (s390x-ibm-linux-gnu)
perl, v5.10.0

I have looked through other questions but cannot find much of relevance. If this is a duplicate or the relevant info is in another question, feel free to mark it as a dupe and I'll check it out.

Any more info needed just ask.

Upvotes: 1

Views: 154

Answers (1)

nwellnhof

Reputation: 33658

The reason is probably that the size of your Thread::Queue is unlimited. If the producer thread is faster than the consumer thread, your queue will continue to grow. So you should simply limit the size of your queue. For example, to set a limit of 1,000 queue items:

$logq->limit = 1000;
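A minimal sketch of that fix in the context of your setup (note that `limit` is an lvalue method added in Thread::Queue 3.01, so on an older perl such as 5.10.0 you may need to update the module from CPAN first):

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $logq = Thread::Queue->new();

# Cap the queue at 1,000 items. Once the cap is reached, enqueue()
# in the reader thread blocks until the consumer drains the queue
# below the limit, so the queue can no longer grow without bound.
$logq->limit = 1000;

print "queue limit is ", $logq->limit, "\n";
```

With the limit in place, a fast producer is throttled to the consumer's pace instead of buffering everything in memory.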

(The way you use the pending method is wrong by the way. You should only terminate if the return value is undefined.)
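To illustrate that point, here is a hedged sketch of the blocking consumer pattern: `dequeue()` waits for an item and returns undef only after `end()` has been called on the queue and it has drained (`end()` also requires Thread::Queue 3.01+), so the consumer can loop until undef instead of polling `pending()` with sleeps:

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $logq = Thread::Queue->new();

my $consumer = threads->create(sub {
    my $count = 0;
    # dequeue() blocks until an item arrives; it returns undef
    # only once the queue has been end()ed and fully drained.
    while (defined(my $data = $logq->dequeue())) {
        $count++;    # process $data here
    }
    return $count;
});

$logq->enqueue($_) for 1 .. 5;
$logq->end();                       # signal "no more items"
my $processed = $consumer->join();
print "processed $processed items\n";
```

This removes both the 1-second polling loop and the race where `pending()` drops to zero between batches while the producer is still running.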

Upvotes: 2
