Thrinath Dosapati

Reputation: 195

How to read multi-line values from a file using Perl

I have a properties file, say

##
## Start of property1
##
##
Property1=\
a:b,\
a1:b1,\
a2,b2
##
## Start of propert2
##
Property2=\
c:d,\
c1:d1,\
c2,d2

Note that the value for any given property may be split across multiple lines.

I want to read this property file using Perl. This works fine in Java, as Java supports multi-line values using the backslash, but in Perl it is a nightmare.

In the above properties file there are two properties - Property1 and Property2 - each associated with a string which I can split based on the delimiters , and :

For a given property (say Property1) and a given first column (say a1), I need to return the second column (here b1).

The code should be able to ignore comments, spaces, etc.

Thanks in Advance

Upvotes: 3

Views: 2962

Answers (2)

Borodin

Reputation: 126762

Most text processing - including handling backslash continuation lines - is very simple in Perl. All you need is a read loop like this.

while (<>) {
  $_ .= <> while s/\\\n// and not eof;
}

The program below does what I think you want. I have put a print call in the read loop to show the complete records that have been aggregated over continuation lines. I have also demonstrated extracting the b1 field that you gave as an example, and shown the output from Data::Dump so that you can see the data structure that is created.

use strict;
use warnings;

my %data;

while (<DATA>) {
  next if /^\s*#/;     # skip comment lines
  next unless /\S/;    # skip blank lines
  $_ .= <DATA> while s/\\\n// and not eof;
  print;
  chomp;
  my ($key, $values) = split /=/;
  my @values = map [ split /:/ ], split /,/, $values;
  $data{$key} = \@values;
}

print $data{Property1}[1][1], "\n\n";

use Data::Dump;
dd \%data;


__DATA__
##
## Start of property1
##
##
Property1=\
a:b,\
a1:b1,\
a2,b2
##
## Start of propert2
##
Property2=\
c:d,\
c1:d1,\
c2,d2

output

Property1=a:b,a1:b1,a2,b2
Property2=c:d,c1:d1,c2,d2
b1

{
  Property1 => [["a", "b"], ["a1", "b1"], ["a2"], ["b2"]],
  Property2 => [["c", "d"], ["c1", "d1"], ["c2"], ["d2"]],
}

Update

I have read your question again and I think you may prefer a different representation of your data. This variant keeps the property values as hashes instead of arrays of arrays; otherwise its behaviour is identical.

use strict;
use warnings;

my %data;

while (<DATA>) {
  next if /^\s*#/;     # skip comment lines
  next unless /\S/;    # skip blank lines
  $_ .= <DATA> while s/\\\n// and not eof;
  print;
  chomp;
  my ($key, $values) = split /=/;
  my %values = map { my @kv = split /:/; @kv[0,1] } split /,/, $values;
  $data{$key} = \%values;
}

print $data{Property1}{a1}, "\n\n";

use Data::Dump;
dd \%data;

output

Property1=a:b,a1:b1,a2,b2
Property2=c:d,c1:d1,c2,d2
b1

{
  Property1 => { a => "b", a1 => "b1", a2 => undef, b2 => undef },
  Property2 => { c => "d", c1 => "d1", c2 => undef, d2 => undef },
}
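
If it helps, the hash variant can be wrapped into a pair of subroutines that answer the original question directly (given a property and a first column, return the second column). This is only a sketch: the names parse_properties and lookup are my own, the sample data is fed through an in-memory filehandle for the demo, and I am assuming that whitespace around keys and values should be trimmed.

```perl
use strict;
use warnings;

# Hypothetical helper: read properties from an open filehandle into a
# hash of hashes, joining backslash continuation lines as it goes.
sub parse_properties {
    my ($fh) = @_;
    my %data;
    while (my $line = <$fh>) {
        next if $line =~ /^\s*#/;     # skip comment lines
        next unless $line =~ /\S/;    # skip blank lines
        # Remove a trailing backslash + newline and append the next line.
        while ($line =~ s/\\\n\z// and not eof($fh)) {
            $line .= <$fh>;
        }
        chomp $line;
        my ($key, $values) = split /=/, $line, 2;
        next unless defined $values;  # not a key=value line
        s/^\s+|\s+$//g for $key, $values;
        my %pairs;
        for my $item (split /,/, $values) {
            my ($name, $value) = split /:/, $item, 2;
            for ($name, $value) {
                next unless defined;
                s/^\s+//;
                s/\s+$//;
            }
            next unless defined $name and length $name;
            $pairs{$name} = $value;   # undef when the item has no colon
        }
        $data{$key} = \%pairs;
    }
    return \%data;
}

# Hypothetical helper: second column for a property and first column,
# or undef if either is missing.
sub lookup {
    my ($data, $property, $column) = @_;
    return unless exists $data->{$property};
    return $data->{$property}{$column};
}

# Demo on an in-memory copy of part of the sample file.
my $sample = "## comment\nProperty1=\\\na:b,\\\na1:b1,\\\na2,b2\n";
open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
my $data = parse_properties($fh);
print lookup($data, 'Property1', 'a1'), "\n";    # prints b1
```

To read a real file, open it with a three-argument open and pass that handle to parse_properties instead of the in-memory one.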

Upvotes: 5

dan1111

Reputation: 6566

Assuming your file isn't too large, here is a simple approach:

use strict;
use warnings;

open my $fh, '<', 'my_file.txt' or die "Can't open my_file.txt: $!";

# Slurp the whole file; $/ is undefined only inside the do block.
my $file = do { local $/; <$fh> };
close $fh;

# If \ is found at the end of a line, delete it and the following line break.
$file =~ s/\\\n//g;

Any time a line ends with \, the backslash and the line break that follows it are removed. This puts each multi-line property on a single line in $file, which you can then split on newlines and parse.

The downside is that this reads the entire file into memory. If your input file is very large, you could adapt it to an algorithm that goes through the file line by line.
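
As a rough sketch of that line-by-line adaptation, the joining step can be moved into a small reader that hands back one logical line per call, so only the current property is held in memory. The name logical_line_reader is my own, and the demo reads from an in-memory filehandle rather than a real file.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that yields one logical line
# (continuations joined) per call, and undef at end of input.
sub logical_line_reader {
    my ($fh) = @_;
    return sub {
        my $line = <$fh>;
        return unless defined $line;
        # Remove a trailing backslash + newline and append the next line.
        while ($line =~ s/\\\n\z// and not eof($fh)) {
            $line .= <$fh>;
        }
        return $line;
    };
}

# Demo on an in-memory copy of part of the sample file.
my $text = "Property2=\\\nc:d,\\\nc1:d1\n## comment\n";
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my $next_line = logical_line_reader($fh);
while (defined(my $line = $next_line->())) {
    print $line;    # prints "Property2=c:d,c1:d1", then "## comment"
}
```

Comment and blank lines are passed through unchanged here, so the caller still decides what to skip.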

Upvotes: 0
