user105033

Reputation: 19568

How can I read an unsigned int from a binary file in Perl?

Let's say I have a binary file that is formatted like

    [unsigned int(length of text)][text][unsigned int(length of text)][text][unsigned int(length of text)][text]

That pattern repeats for the rest of the file. How do I read each unsigned int and print it out, followed by its text block, in Perl?

Again, this is a binary file and not a plain text file.

Upvotes: 3

Views: 7151

Answers (4)

daotoad

Reputation: 27183

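In addition to using unpack, as RC points out, you will almost certainly want to use read (buffered) or sysread (unbuffered) to read data from the file. Pick one and stick with it: mixing the two on the same filehandle can give confusing results.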
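
A minimal sketch of what that can look like with read. The question does not give the size or byte order of the length field, so this assumes 32-bit little-endian ('V'); `read_exact` is a hypothetical helper name, not a built-in, and the demo reads from an in-memory file rather than a real one.

```perl
use strict;
use warnings;

# Hypothetical helper (not a built-in): read exactly $want bytes from $fh.
# Returns undef on a clean end of file, dies on errors or truncated data.
sub read_exact {
    my ($fh, $want) = @_;
    my $buf = '';
    while (length($buf) < $want) {
        # The 4-argument form of read places new data at the given offset in $buf.
        my $got = read($fh, $buf, $want - length($buf), length($buf));
        die "read failed: $!\n" unless defined $got;
        last if $got == 0;    # end of file
    }
    return undef if $want > 0 && length($buf) == 0;
    die "truncated record\n" if length($buf) < $want;
    return $buf;
}

# Demo on an in-memory file holding one record: length 5, then "hello".
# Assumes a 32-bit little-endian length ('V'); adjust to the real format.
my $data = pack('V a*', 5, 'hello');
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!\n";
binmode $fh;

while (defined(my $packed = read_exact($fh, 4))) {
    my $length = unpack('V', $packed);
    my $text   = read_exact($fh, $length);
    die "truncated record\n" unless defined $text;
    print "$length\t$text\n";
}
```

Checking the return value of every read is what lets you tell a normal end of file apart from a file that was cut off in the middle of a record.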

Upvotes: 0

RC.

Reputation: 28237

You'll need to use the unpack function on the data. Check out the Pack/Unpack Tutorial (aka How the System Stores Data).

This should get you headed in the right direction (assuming a 32-bit unsigned int in native byte order):

#!/usr/bin/perl

use strict;
use warnings;

my $strBuf = "perl rocks";
## "L" is always a 32-bit unsigned int; "a*" writes the string with no padding
my $packed = pack("L a*", length($strBuf), $strBuf);
{
    open(my $binFile, '>', "test.bin") || die("Error opening file: $!\n");
    binmode $binFile;
    print $binFile $packed;
    close $binFile;
}


open(my $binFile, '<', "test.bin") || die("Error opening file: $!\n");
binmode $binFile;

my $buffer;
read($binFile, $buffer, 4);  ## Read out unsigned int binary data
my $length    = unpack("L", $buffer);  ## Unpack the data

read($binFile, $buffer, $length);  ## Read the length out as binary
my $string = unpack("a$length", $buffer);   ## Unpack the string data in buffer (keeps any embedded NULs)

print "Len: $length  String: $string\n";
exit;

Upvotes: 1

jmcnamara

Reputation: 41624

Here is a small working example.

#!/usr/bin/perl

use strict;
use warnings;

my $INT_SIZE = 2;
my $filename = 'somefile.bin';

open my $fh, '<', $filename or die "Couldn't open file $filename: $!\n";

binmode $fh;

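while ( read $fh, my $packed_length, $INT_SIZE ) {

    my $text = '';
    my $length = unpack 'v', $packed_length;

    # Stop if the file ends before the full text block has been read.
    my $got = read $fh, $text, $length;
    die "Truncated record in $filename\n" unless defined $got && $got == $length;

    print $length, "\t", $text, "\n";
}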

Change $INT_SIZE and the unpack template to match the size and endianness of your length field: 'v' (16-bit little-endian), 'n' (16-bit big-endian), 'V' (32-bit little-endian) or 'N' (32-bit big-endian). See the unpack manpage for more details.
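
To see how much the template matters, here is a small sketch that decodes the same four bytes with each of the four templates. The byte values are made up for illustration.

```perl
use strict;
use warnings;

# Four sample bytes (illustrative values only).
my $bytes = "\x01\x02\x03\x04";

my %decoded = (
    v => unpack('v', $bytes),    # 16-bit little-endian: 0x0201     = 513
    n => unpack('n', $bytes),    # 16-bit big-endian:    0x0102     = 258
    V => unpack('V', $bytes),    # 32-bit little-endian: 0x04030201 = 67305985
    N => unpack('N', $bytes),    # 32-bit big-endian:    0x01020304 = 16909060
);

printf "%s => %d\n", $_, $decoded{$_} for qw(v n V N);
```

If the lengths you get back look absurdly large, the wrong endianness or integer size is the first thing to suspect.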

Upvotes: 2

David Harris

Reputation: 2340

There is not really enough information here to solve this problem completely.

What is needed is the exact format of the length field and of the text field. Is the int 2 bytes, 4 bytes or 8 bytes? (All are possible.) Also is it little-endian or big-endian?

Given this information, you then access the first integer using the read function, and convert it to a number using bit operations or the unpack function.
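
As a rough sketch of the bit-operations route, assuming a 4-byte little-endian length field (an assumption, since the question does not say); `le32_from_bytes` is a hypothetical helper name:

```perl
use strict;
use warnings;

# Hypothetical helper: combine 4 bytes into an unsigned 32-bit integer,
# treating the first byte as the least significant (little-endian).
sub le32_from_bytes {
    my ($buf) = @_;
    my @b = unpack('C4', $buf);    # four unsigned byte values
    return $b[0] | ($b[1] << 8) | ($b[2] << 16) | ($b[3] << 24);
}

my $buf = "\x0a\x00\x00\x00";
print le32_from_bytes($buf), "\n";    # same value as unpack('V', $buf)
```

In practice unpack with the right template does the same job in one call, so the manual version is mostly useful for odd field sizes such as 3-byte integers.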

The next issue is the exact format of the text string. Is it ASCII, EBCDIC or a UTF format? Knowing this, and whether the length field counts bytes or characters, you can work out how many bytes to read and use one or more read operations to obtain the raw string, which you may have to convert into a more manageable form.
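
For example, if the text turns out to be UTF-8 and the length field counts bytes (both assumptions here), the conversion step might look like this sketch using the core Encode module:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw bytes as they would come back from read: "café" encoded as UTF-8.
my $raw  = "caf\xc3\xa9";

# Turn the bytes into a Perl character string.
my $text = decode('UTF-8', $raw);

printf "bytes: %d, characters: %d\n", length($raw), length($text);
```

The byte count and the character count differ here, which is why it matters what the length field in the file actually measures.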

One other thing: you'll need to open the file in binary mode (call binmode on the filehandle), otherwise you may not get the results you expect.

Upvotes: 0
