thomaslee

Reputation: 407

Puzzled about reading from files in Perl

When I want to read from a file, I use the following code:

open my $fh, '<', "file_path" or die "Cannot open file_path: $!";
while (<$fh>)
{
    # do something here
}

But I'm worried about I/O performance if the file is very large, since this appears to read from disk line by line.

In a C program, we might prefer to read several bytes (for example, 4096 bytes) into memory at a time by calling 'fread'. That reduces the number of disk I/O operations and improves the program's performance.

So my question is: is there any way to read multiple lines or several bytes from a file at a time in Perl? Or does Perl encapsulate the I/O details, so that I do not need to worry about this?

Upvotes: 0

Views: 207

Answers (3)

shawnhcorey

Reputation: 3601

By default, all I/O operations in Perl are buffered. Buffering maximizes the throughput of I/O operations. Don't mess with it unless you require special processing, because changing the default will slow down your program.

Upvotes: 2

amon

Reputation: 57600

When you open a file with open, there are several layers between the filehandle and the physical resource. These include buffering (which can be switched off for output filehandles) and PerlIO layers.

PerlIO layers can perform encoding or line-ending transformations. For example, to open a UTF-8-encoded file, we would write:

use autodie; # throw fatal exception when open fails
open my $fh, "<:utf8", $filename;

Layers can be specified in open with :layer after the open mode. The :utf8 layer is similar to :encoding(UTF-8), but it does not check that the input is valid UTF-8, so :encoding(UTF-8) is the stricter choice. Layers can also be added via binmode.
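
To illustrate adding a layer after the fact, here is a minimal sketch that opens a file with no layer and then applies a decoding layer with binmode. The scratch file and the variable names ($path, $char_count) are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file holding the raw UTF-8 bytes for "é\n".
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;                         # write raw bytes, no layers
print {$out} "\xC3\xA9\n";
close $out or die "close failed: $!";

# Open without a layer, then add the decoding layer before reading.
open my $fh, '<', $path or die "Cannot open $path: $!";
binmode $fh, ':encoding(UTF-8)';
my $line = <$fh>;
close $fh;

chomp $line;
my $char_count = length $line;        # counts decoded characters, not bytes
```

With the layer in place, the two bytes on disk are decoded into a single character.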

If you want to read a fixed length string from such a filehandle into a buffer, you can use the very C-ish read.
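
A rough sketch of such a fixed-size read loop, assuming a 4096-byte chunk as in the question. The scratch file and the counters ($total, $chunks) exist only for this illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file with 10,000 bytes of data.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} 'x' x 10_000;
close $out or die "close failed: $!";

open my $fh, '<:raw', $path or die "Cannot open $path: $!";
my $chunk_size = 4096;
my ($total, $chunks) = (0, 0);
while (1) {
    # read returns the number of bytes read, 0 at EOF, undef on error
    my $got = read($fh, my $buf, $chunk_size);
    die "read failed: $!" unless defined $got;
    last if $got == 0;
    $total  += $got;
    $chunks += 1;
    # process $buf here
}
close $fh;
```

Note that read still goes through Perl's buffering, so the chunk size chosen here does not have to match the size of the underlying system reads.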

If you want unbuffered access to a file, you can use sysopen/sysread, but it is very unlikely that you have a situation where this is beneficial. You can't use layers in this case, which sucks for text data.
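
For comparison, unbuffered reading might look roughly like this. It is only a sketch: the scratch file and $bytes_read are invented for the example, and sysread may return fewer bytes than requested, so the loop runs until it returns 0.

```perl
use strict;
use warnings;
use Fcntl qw(O_RDONLY);
use File::Temp qw(tempfile);

# Hypothetical scratch file with 3,000 bytes of data.
my ($out, $path) = tempfile(UNLINK => 1);
binmode $out;
print {$out} 'abc' x 1000;
close $out or die "close failed: $!";

sysopen(my $fh, $path, O_RDONLY) or die "Cannot sysopen $path: $!";
my $bytes_read = 0;
while (1) {
    # Each sysread call bypasses Perl's buffer and asks the OS directly.
    my $got = sysread($fh, my $buf, 4096);
    die "sysread failed: $!" unless defined $got;
    last if $got == 0;                # end of file
    $bytes_read += $got;
    # process $buf here
}
close $fh;
```

Do not mix sysread with buffered calls such as <$fh> or read on the same filehandle, since the buffered calls may already have consumed data that sysread then skips.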

If you just want to read a whole file at once, you should use the File::Slurp module, which is optimized for this use case. But for regular, line-by-line processing, the readline operator <$fh> is good enough.
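
File::Slurp is not a core module, so as a stand-in here is a sketch of the common core-only idiom for reading a whole file: undefine the input record separator $/ locally. This is not how File::Slurp itself is implemented, and the scratch file and variable names are assumptions for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical scratch file with three lines.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "line 1\nline 2\nline 3\n";
close $out or die "close failed: $!";

my $content = do {
    open my $fh, '<', $path or die "Cannot open $path: $!";
    local $/;      # no record separator: one readline returns the whole file
    <$fh>;
};

# Count newlines to confirm that everything arrived in a single string.
my $line_count = () = $content =~ /\n/g;
```

The local keeps the change to $/ confined to the do block, so line-by-line reads elsewhere in the program are unaffected.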

Upvotes: 3

Qiau

Reputation: 6175

Yes, by default I/O is buffered; see open and perliol.

You could use sysopen & sysread if you want access to the C-style functions.

Upvotes: 5
