michael huser

Reputation: 11

How to optimize Perl code for load testing using threads or parallelism

Hi, for basic load testing I have prepared the Perl code below, which pushes around 10000 files into my system. However, I'm having trouble getting the performance where I want it. I don't care if it uses 100% CPU. My target is to push 10000 files in 1 second. Is there a better way to write this script in Perl (with the help of threads or parallelism)?

#!/usr/bin/perl
my $directory= "/home/Documents/File";
chdir $directory;
opendir(DIR, ".") or die "couldn't open $directory: $!\n";
foreach my $file (readdir DIR){
  my $cmd = "ft -MI -NMM -P 500 -f $file -d.";
  system ( "cat","$cmd");
  close $in_fh;
}
close DIR;

Upvotes: 1

Views: 88

Answers (1)

Sobrique

Reputation: 53498

You're operating under a misconception. What parallel code does is allow you to use multiple CPUs concurrently. This means for CPU intensive workloads, you get performance increases - the more decoupled the task, the better it scales.

However, your task is reading a filesystem. It looks like you're not doing anything more complicated than a directory traversal and read.

Your limiting factor will almost certainly be your disk subsystem, so parallelism won't help you in the slightest. Indeed, it might make things worse: most disk controllers can detect sequential access patterns and prefetch, but if you turn the access pattern pseudo-random by parallelising, they can't do so as efficiently.
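If you'd rather measure than take my word for it, time the loop before and after any change. A minimal sketch using the core Time::HiRes module; elapsed_seconds is a hypothetical helper name of mine, not something from your script:

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Run the code reference $work once and return the wall-clock
# seconds it took, as a floating point number.
sub elapsed_seconds {
    my ($work) = @_;
    my $start = [gettimeofday];
    $work->();
    return tv_interval($start);
}

# Hypothetical usage: wrap whatever processes the directory.
# my $secs = elapsed_seconds(sub { process_directory('/home/Documents/File') });
# printf "%.3f seconds\n", $secs;
```

If the sequential time is already dominated by waiting on the disk, adding threads is unlikely to move that number much.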

So - short answer is don't bother, because you won't gain much.

You might want to consider not making a system call to run cat, and just using Perl's built-in open instead, which will probably speed you up a little.
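Something along these lines reads each file in-process instead of spawning cat. It's a sketch, assuming all you need is the file contents; slurp_directory is a name I made up, and it just returns the total bytes read so you can see that it did something:

```perl
use strict;
use warnings;

# Read every regular file in $dir with Perl's built-in open instead of
# spawning `cat`. Returns the total number of bytes read.
sub slurp_directory {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "couldn't open $dir: $!\n";
    my $total = 0;
    while (defined(my $name = readdir $dh)) {
        my $path = "$dir/$name";
        next unless -f $path;    # skips '.', '..' and subdirectories
        open(my $in_fh, '<:raw', $path) or die "couldn't open $path: $!\n";
        local $/;                # slurp mode: read the whole file in one go
        my $content = <$in_fh>;
        close $in_fh;
        $total += length($content // '');
    }
    closedir $dh;
    return $total;
}

# Hypothetical usage with the directory from the question:
# print slurp_directory('/home/Documents/File'), " bytes read\n";
```

Note the lexical filehandle ($in_fh is declared with my) and closedir rather than close for the directory handle.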

You should also ALWAYS use strict; and use warnings; - especially before posting to Stack Overflow - because they'll help you spot some of the more obvious error cases.

Like for example:

Global symbol "$in_fh" requires explicit package name at file.pl line 10.
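For comparison, here is a sketch of the loop that compiles cleanly under strict and warnings. I'm assuming you meant to run ft itself rather than cat; the flags are copied verbatim from your question and I don't know what they do. ft_command and push_files are hypothetical names, and the sketch passes the full path to -f instead of doing a chdir, which may or may not be what ft expects:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the argument list for one file. The ft flags are taken as-is
# from the question; their meaning is an assumption left unverified.
sub ft_command {
    my ($file) = @_;
    return ('ft', '-MI', '-NMM', '-P', '500', '-f', $file, '-d.');
}

# Run the command once per regular file in $dir and return how many
# runs succeeded. $runner can be swapped out for a dry run; by default
# it calls system in list form, so no shell is involved.
sub push_files {
    my ($dir, $runner) = @_;
    $runner //= sub { system(@_) == 0 };
    opendir(my $dh, $dir) or die "couldn't open $dir: $!\n";
    my @files = grep { -f "$dir/$_" } readdir $dh;
    closedir $dh;
    my $ok = 0;
    for my $file (sort @files) {
        $ok++ if $runner->(ft_command("$dir/$file"));
    }
    return $ok;
}

# Hypothetical dry run that prints each command instead of executing it:
# push_files('/home/Documents/File', sub { print "@_\n"; 1 });
```

The list form of system also avoids quoting problems when a file name contains spaces.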

Before even thinking about parallelism, you need to sort out basic errors. Parallel code is quite cool, but it's also a monstrous nightmare to debug if your code is shoddy in the first place.

Upvotes: 3
