Reputation: 19568
Let's say I have a binary file that is formatted like
[unsigned int(length of text)][text][unsigned int(length of text)][text][unsigned int(length of text)][text]
That pattern repeats for the rest of the file. How do I read each unsigned int and print it, followed by its text block, in Perl?
Again, this is a binary file and not a plain text file.
Upvotes: 3
Views: 7151
Reputation: 27183
In addition to using unpack, as RC points out, you will almost certainly want to use read or sysread to read data from the file.
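As a rough sketch of what that can look like, the helper below wraps read so that a truncated file is reported instead of silently producing a short buffer. The name read_exact is made up for this example, and the 4-byte little-endian length ('V') is an assumption about the file format. An in-memory filehandle stands in for the real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return exactly $want bytes, undef at a clean EOF,
# and die on an I/O error or a truncated record.
sub read_exact {
    my ($fh, $want) = @_;
    return '' if $want == 0;                 # nothing to read
    my $got = read($fh, my $buf, $want);
    die "read failed: $!\n" unless defined $got;
    return undef if $got == 0;               # end of file
    die "short read: wanted $want, got $got\n" if $got < $want;
    return $buf;
}

# One record: assumed 4-byte little-endian length, then the text.
my $data = pack('V', 5) . 'hello';
open my $fh, '<', \$data or die "Cannot open in-memory handle: $!\n";
binmode $fh;

my $len  = unpack 'V', read_exact($fh, 4);
my $text = read_exact($fh, $len);
print "$len\t$text\n";
```

read goes through Perl's buffered I/O, which is usually what you want for an ordinary file. sysread bypasses that buffering and should not be mixed with read on the same handle.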
Upvotes: 0
Reputation: 28237
You'll need to use the unpack function on the data. Check out Pack/Unpack Tutorial (aka How the System Stores Data).
This should get you headed in the right direction (assuming a 32-bit unsigned int in native byte order):
#!/usr/bin/perl
use strict;
use warnings;

my $strBuf = "perl rocks";
# "a*" writes exactly length($strBuf) bytes, so the length prefix matches the data.
my $packed = pack("I a*", length($strBuf), $strBuf);
{
    open(my $binFile, '>', "test.bin") || die("Error opening file: $!\n");
    binmode $binFile;
    print $binFile $packed;
    close $binFile;
}
open(my $binFile, '<', "test.bin") || die("Error opening file: $!\n");
binmode $binFile;
my $buffer;
read($binFile, $buffer, 4);               ## Read the 4-byte unsigned int
my $length = unpack("I", $buffer);        ## Unpack the length
read($binFile, $buffer, $length);         ## Read $length bytes of text
my $string = unpack("a$length", $buffer); ## Unpack the raw text from the buffer
print "Len: $length String: $string\n";
close $binFile;
exit;
Upvotes: 1
Reputation: 41624
Here is a small working example.
#!/usr/bin/perl
use strict;
use warnings;
my $INT_SIZE = 2;
my $filename = 'somefile.bin';
open my $fh, '<', $filename or die "Couldn't open file $filename: $!\n";
binmode $fh;
while ( read $fh, my $packed_length, $INT_SIZE ) {
    my $text = '';
    my $length = unpack 'v', $packed_length;
    read $fh, $text, $length;
    print $length, "\t", $text, "\n";
}
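Change $INT_SIZE and the unpack template to match the size and byte order of your length field: 'v' (little-endian) or 'n' (big-endian) for a 16-bit value with $INT_SIZE = 2, and 'V' (little-endian) or 'N' (big-endian) for a 32-bit value with $INT_SIZE = 4. See the unpack manpage for more details.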
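To make the difference between those four templates concrete, here is a small illustration that decodes the same made-up byte string with each one. The bytes are arbitrary sample data, not taken from any real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Four arbitrary sample bytes: 0x01 0x02 0x03 0x04.
my $bytes = "\x01\x02\x03\x04";

my %decoded = (
    v => unpack('v', $bytes),   # 16-bit little-endian, uses the first 2 bytes
    n => unpack('n', $bytes),   # 16-bit big-endian, uses the first 2 bytes
    V => unpack('V', $bytes),   # 32-bit little-endian, uses all 4 bytes
    N => unpack('N', $bytes),   # 32-bit big-endian, uses all 4 bytes
);

printf "%s => 0x%X\n", $_, $decoded{$_} for qw(v n V N);
# v => 0x201
# n => 0x102
# V => 0x4030201
# N => 0x1020304
```

If the length you get back looks absurdly large, the byte order is the first thing to check.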
Upvotes: 2
Reputation: 2340
There is not really enough information here to solve this problem completely.
What is needed is the exact format of the length field and of the text field. Is the int 2 bytes, 4 bytes or 8 bytes? (All are possible.) Also is it little-endian or big-endian?
Given this information, you then access the first integer using the read function, and convert it to a number using bit operations or the unpack function.
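For illustration, here is the same conversion done both ways, assuming a 4-byte little-endian length. The function name le32_from_bytes is invented for this sketch; in practice unpack is the shorter route.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: combine 4 raw bytes (least significant first)
# into one unsigned integer using shifts and ORs.
sub le32_from_bytes {
    my ($raw) = @_;
    my @b = map { ord($_) } split //, $raw;
    return $b[0] | ($b[1] << 8) | ($b[2] << 16) | ($b[3] << 24);
}

my $raw       = pack 'V', 305419896;    # 0x12345678 as 4 little-endian bytes
my $by_hand   = le32_from_bytes($raw);
my $by_unpack = unpack 'V', $raw;

print "$by_hand $by_unpack\n";          # prints "305419896 305419896"
```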
The next issue is the exact format of the text string. Is it ASCII, EBCDIC or a UTF format? Does the length field count bytes or characters? Knowing this, you can work out how many bytes the string occupies and use one or more read operations to obtain the raw string, which you may then have to convert into a more manageable form.
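A minimal sketch of that last conversion step, under two assumptions that the question does not confirm: the text is UTF-8, and the length field counts bytes rather than characters. It uses the core Encode module and builds the record in memory instead of reading a file.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# Build one sample record: assumed 4-byte little-endian byte count, then UTF-8 text.
my $text   = "na\x{EF}ve";                  # 5 characters, one of them non-ASCII
my $raw    = encode('UTF-8', $text);        # 6 bytes once encoded
my $record = pack('V', length $raw) . $raw;

# Read it back: the length gives bytes, decoding gives characters.
my $len     = unpack 'V', substr($record, 0, 4);
my $payload = substr($record, 4, $len);
my $decoded = decode('UTF-8', $payload);

printf "%d bytes, %d characters\n", $len, length $decoded;
# prints "6 bytes, 5 characters"
```

If the length field counted characters instead, you could not know how many bytes to read up front for a variable-width encoding, which is why the exact format matters.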
One other thing: you'll need to put the filehandle in binary mode (binmode in Perl); otherwise newline translation or an encoding layer may alter the bytes you read and you may not get the results you expect.
Upvotes: 0