slowpoison

Reputation: 616

Correct regex to weed out comments, and other unwanted lines in a configuration file

I have a configuration file with a very simple format:

# comment
key = value

Here's what I wrote in the input loop to ignore unwanted lines before I split key-value:

while (<C>) {
    chomp;
    # ignore all comments, blank lines or other wrongly formatted lines
    # so that we are only left with key = value
    next unless /(?<!#)\s*\w+\s*=\s*\w+/;

My question: does this sufficiently cover the need to ignore unwanted lines or is there something I'm missing?

Update: My question is specifically about whether my next unless... statement covers all unwanted cases. I know there are different philosophies about the best way to parse configuration files.

Upvotes: 0

Views: 325

Answers (3)

mpe

Reputation: 1000

I just recently did this:

while (my $line = <$fh>) {
    next if $line =~ /^\s*$/; # skip empty lines
    next if $line =~ /^\s*#/; # skip lines that are only comments

    # both sides of the '=' must contain alphanumeric characters
    if ($line !~ /^\s*\w+\s*=\s*\w+/) {
        warn "Invalid format in line $.\n";
        next;
    }

    # split by '=', producing a maximum of two items;
    # the value may contain whitespace
    my ($key, $value) = split '=', $line, 2;

    foreach ($key, $value) {
        s/\s+$//; # remove trailing whitespace
        s/^\s+//; # remove leading whitespace
    }

    # store in a hash or whatever you like
    $config{$key} = $value;
}

I like it because it allows a very flexible config file.
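
As a rough illustration of that flexibility, the sketch below wraps the same split-and-trim steps in a helper and feeds it a value that itself contains '=' and spaces. The sub name parse_line and the sample lines are hypothetical, not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (key, value) for a "key = value" line,
# or an empty list for blank lines, comments, and malformed lines.
sub parse_line {
    my ($line) = @_;
    return if $line =~ /^\s*$/;                # blank line
    return if $line =~ /^\s*#/;                # comment-only line
    return if $line !~ /^\s*\w+\s*=\s*\w+/;    # not key = value

    # A limit of 2 keeps any later '=' inside the value.
    my ($key, $value) = split /=/, $line, 2;
    for ($key, $value) {
        s/^\s+//;    # remove leading whitespace
        s/\s+$//;    # remove trailing whitespace (including the newline)
    }
    return ($key, $value);
}

my %config;
for my $line ("name = John Smith\n", "query = a=b\n", "# note\n", "\n") {
    # An empty list from parse_line makes the assignment false, so we skip.
    my ($key, $value) = parse_line($line) or next;
    $config{$key} = $value;
}
print "$_ => $config{$_}\n" for sort keys %config;
```

With these sample lines, only name and query end up in %config, and the value of query keeps its embedded '='.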

Upvotes: 0

yasu

Reputation: 1364

This may be what you want. The configuration is stored in the hash %conf. (You don't have to use split.)

while(<C>) {
  chomp;

  # Skip comment
  next if /^#/;

  # Process configuration
  if(/^(\w+)\s*=\s*(.*)/) {
    my ($key, $value) = ($1, $2);
    $conf{$key} = $value;
  } else {
    print STDERR "Invalid format: $_\n";
  }
}
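
To try the capture-based approach without a real file, here is a small sketch that reads from an in-memory filehandle. The sample text and the names $sample, $fh, and @invalid are assumptions for illustration. It also skips blank lines, which the loop above would report as invalid.

```perl
use strict;
use warnings;

# Made-up sample input standing in for the configuration file.
my $sample = "# settings\nhost = example.org\n\nport=8080\nbroken line\n";

# Open a read-only filehandle on the string instead of a file on disk.
open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";

my (%conf, @invalid);
while (my $line = <$fh>) {
    chomp $line;
    next if $line =~ /^\s*(?:#|$)/;    # skip comments and blank lines

    # The captures give the key and value directly, so no split is needed.
    if ($line =~ /^(\w+)\s*=\s*(.*)/) {
        $conf{$1} = $2;
    } else {
        push @invalid, $line;
    }
}
close $fh;

print "$_ = $conf{$_}\n" for sort keys %conf;
print STDERR "Invalid format: $_\n" for @invalid;
```

Collecting the rejected lines in @invalid rather than printing them immediately makes it easier to decide afterwards whether a malformed file should be fatal.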

Upvotes: 1

the_qbf

Reputation: 368

You could simply do:

while (<C>) {
    chomp;

    # if you only want to skip comments ( lines starting with # )
    # then you could just specify it directly 
    next if $_ =~ m/\A#/;

}

Using "unless" complicates things, because you end up saying "skip unless it is not a comment", which can cover a lot of cases. If you use "if" instead, you can state directly that you are skipping comment lines.

Things that may be unclear to you:

- $_ is the current line of the file.
- \A inside the regex means the start of the string, which here is the start of the line.

The regex matches every line that has a # at the beginning.
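
To see why anchoring matters for the original question, here is a minimal sketch comparing the question's unanchored pattern with an anchored variant. The anchored pattern, the sample lines, and the names $unanchored, $anchored, and check are made up for illustration.

```perl
use strict;
use warnings;

# The pattern from the question: nothing ties it to the start of the line.
my $unanchored = qr/(?<!#)\s*\w+\s*=\s*\w+/;

# An anchored variant (an assumption, not from the original posts):
# the key must be the first non-blank thing on the line.
my $anchored = qr/\A\s*\w+\s*=\s*\w+/;

# Returns 1 if the line matches the pattern, 0 otherwise.
sub check {
    my ($pattern, $line) = @_;
    return $line =~ $pattern ? 1 : 0;
}

for my $line ('key = value', '# key = value', '#key=value', '', 'no equals sign') {
    printf "%-18s unanchored=%d anchored=%d\n",
        "'$line'", check($unanchored, $line), check($anchored, $line);
}
```

The unanchored pattern still accepts both commented lines: the regex engine is free to start the match further along the line, and the lookbehind only rejects the single position directly after the #. The anchored variant rejects them because a # can never satisfy \w at the start.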

Upvotes: 0
