England

Reputation: 27

How to keep whitespace formatting while reading a file into an array using Perl?

I am trying to read a text file into an array of lines while keeping the whitespace formatting of each line. Once I have read the entire file into an array, if I iterate through the array and print each line to STDOUT, all of the whitespace formatting is lost, as well as some of the text (look at the first line). However, if I print the lines while reading the file, or print the array using Data::Dumper, the whitespace formatting is still there.

Since I am trying to capture the output on STDOUT while iterating through the array, please tell me what I need to do to print out the properly whitespace formatted text.

The code and output below are pretty much self-explanatory.

I dual boot between macOS Mojave 10.14.6 and Linux Mint 20 and have the same problem on both systems, which run different versions of Perl (listed below).

Thanks in advance for any advice!!!

macOS:

macOS Mojave 10.14.6
Darwin MBPro.lan 18.7.0 Darwin Kernel Version 18.7.0: Tue Aug 20 16:57:14 PDT 2019; root:xnu-4903.271.2~2/RELEASE_X86_64 x86_64

This is perl 5, version 18, subversion 4 (v5.18.4) built for darwin-thread-multi-2level
(with 2 registered patches, see perl -V for more detail)

linux:

Linux mintMBP 5.4.0-47-generic #51-Ubuntu SMP Fri Sep 4 19:50:52 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

This is perl 5, version 30, subversion 0 (v5.30.0) built for x86_64-linux-gnu-thread-multi
(with 46 registered patches, see perl -V for more detail)

Here is the code:

#!/usr/bin/perl

use Data::Dumper;

my $filename = "formatted.txt";
my @lines;

open(my $FH, "<", $filename)
    or die "Can't open < $filename: $!";

print "\nText printed while reading file:\n\n";
while(<$FH>) {
    chomp($_);
    push(@lines, $_);
    print $_ . "\n";
}

close($FH);

print "\nText printed while iterating the line array:\n\n";
while(<@lines>) {
    print $_ . "\n";
}

print "\nText printed using Data::Dumper :\n\n";
print Dumper(@lines);

Here is the whitespace formatted input file:

<?xml version="1.0" encoding="UTF-8"?>
<dict>
        <key>Definitions</key>
        <dict>
                <key>src/main.c</key>
                <dict>
                        <key>Path</key>
                        <string>src/main.c</string>
                        <key>Group</key>
                        <array>
                                <string>src</string>
                        </array>
                </dict>
        </dict>
</dict>

Here is the output:

[MBPro:Development/tmp/perl_formatted_lines] justin% ./file_test.pl

Text printed while reading file:

<?xml version="1.0" encoding="UTF-8"?>
<dict>
    <key>Definitions</key>
    <dict>
        <key>src/main.c</key>
        <dict>
            <key>Path</key>
            <string>src/main.c</string>
            <key>Group</key>
            <array>
                <string>src</string>
            </array>
        </dict>
    </dict>
</dict>

Text printed while iterating the line array:

version=1.0
<dict>
<key>Definitions</key>
<dict>
<key>src/main.c</key>
<dict>
<key>Path</key>
<string>src/main.c</string>
<key>Group</key>
<array>
<string>src</string>
</array>
</dict>
</dict>
</dict>

Text printed using Data::Dumper :

$VAR1 = '<?xml version="1.0" encoding="UTF-8"?>';
$VAR2 = '<dict>';
$VAR3 = '   <key>Definitions</key>';
$VAR4 = '   <dict>';
$VAR5 = '       <key>src/main.c</key>';
$VAR6 = '       <dict>';
$VAR7 = '           <key>Path</key>';
$VAR8 = '           <string>src/main.c</string>';
$VAR9 = '           <key>Group</key>';
$VAR10 = '          <array>';
$VAR11 = '              <string>src</string>';
$VAR12 = '          </array>';
$VAR13 = '      </dict>';
$VAR14 = '  </dict>';
$VAR15 = '</dict>';

Upvotes: 0

Views: 208

Answers (2)

Polar Bear

Reputation: 6808

Here is the OP's demo code, corrected to read the lines into an array @lines and print them with a for loop:

use strict;
use warnings;
use feature 'say';

use Data::Dumper;

my $fname = 'formatted.txt';
my @lines;

open my $fh, '<', $fname
    or die "Couldn't open $fname: $!";

@lines = <$fh>;

close $fh;

chomp(@lines);

say Dumper(\@lines);
say for @lines;

Input

<?xml version="1.0" encoding="UTF-8"?>
<dict>
        <key>Definitions</key>
        <dict>
                <key>src/main.c</key>
                <dict>
                        <key>Path</key>
                        <string>src/main.c</string>
                        <key>Group</key>
                        <array>
                                <string>src</string>
                        </array>
                </dict>
        </dict>
</dict>

Output

$VAR1 = [
          '<?xml version="1.0" encoding="UTF-8"?>',
          '<dict>',
          '        <key>Definitions</key>',
          '        <dict>',
          '                <key>src/main.c</key>',
          '                <dict>',
          '                        <key>Path</key>',
          '                        <string>src/main.c</string>',
          '                        <key>Group</key>',
          '                        <array>',
          '                                <string>src</string>',
          '                        </array>',
          '                </dict>',
          '        </dict>',
          '</dict>'
        ];

<?xml version="1.0" encoding="UTF-8"?>
<dict>
        <key>Definitions</key>
        <dict>
                <key>src/main.c</key>
                <dict>
                        <key>Path</key>
                        <string>src/main.c</string>
                        <key>Group</key>
                        <array>
                                <string>src</string>
                        </array>
                </dict>
        </dict>
</dict>
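
As a quick check that does not depend on formatted.txt, here is a minimal sketch of the same slurp-and-chomp approach using an in-memory filehandle. The sample text in $text and the $rebuilt variable are assumptions made up for this illustration; they are not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical sample standing in for formatted.txt
my $text = "<dict>\n        <key>Path</key>\n                <string>src</string>\n</dict>\n";

# Open a read handle on the string instead of a file on disk
open my $fh, '<', \$text
    or die "Couldn't open in-memory file: $!";
my @lines = <$fh>;
close $fh;

chomp(@lines);

# Rebuild the text from the array; it should match the input exactly
my $rebuilt = join('', map { "$_\n" } @lines);

print $rebuilt eq $text ? "whitespace preserved\n" : "whitespace changed\n";
```

If the rebuilt text equals the input, the array holds every line with its leading whitespace intact, so any loss must happen in the printing loop rather than in the read.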

Upvotes: -1

choroba

Reputation: 241988

Don't use while (<@lines>); it doesn't do what you think.*

Instead, use a for loop:

for (@lines) {
    print $_, "\n";
}

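*) It calls glob: the array's contents are joined with $" and the result is interpreted as a glob expression, i.e. a list of whitespace-separated wildcard patterns. That is why the leading whitespace disappears. On the first line, the words containing the wildcard ? match no file names and are dropped, and the quotes in version="1.0" are stripped, leaving version=1.0.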
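
To see the difference side by side, here is a small sketch that calls glob explicitly on the interpolated array, which is what the angle-bracket form does, and compares it with a plain for loop. The sample lines are taken from the question; the variable name @via_glob is made up for this illustration. Patterns that contain wildcards are matched against the current directory, so the exact list glob returns can vary from one directory to another.

```perl
use strict;
use warnings;

my @lines = (
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<dict>',
    '        <key>Definitions</key>',
);

# "@lines" joins the elements with $" (a single space by default);
# glob then splits that string on whitespace and treats each piece
# as a file name pattern.
my @via_glob = glob("@lines");

print "glob: [$_]\n" for @via_glob;   # leading whitespace is gone
print "for:  [$_]\n" for @lines;      # each line is printed unchanged
```

The brackets in the output make the missing indentation easy to spot in the glob results, while the for loop prints each element exactly as it was stored.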

Upvotes: 6
