francis

Reputation: 6359

In Perl, how can I remove all spaces that are not inside double quotes " "?

I'm trying to come up with a regex that will remove all space characters from a string, except those inside double quotes (").

Example string:

some string with "text in quotes"

Result:

somestringwith"text in quotes"

So far I've come up with something like this:

    $str =~ /"[^"]+"|/g;

But it doesn't seem to be giving the intended result.

I'm honestly very new to Perl and haven't had much regex experience. So if anyone willing to answer could also provide some insight into the why and how, that would be great!

Thanks!

EDIT

The string will not contain escaped double quotes.

It should actually always be formatted like this:

Some.String = "Some Value"

Result would be

Some.String="Some Value"

Upvotes: 3

Views: 2090

Answers (6)

Michael Slade

Reputation: 13877

It can be done with regex:

s/("[^"]*"|[^ "]*) */$1/g

Note that this won't handle any kind of escapes inside the quotes.
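As a minimal sketch of the single-substitution idea (not necessarily the exact pattern above): match either a complete quoted string, which is captured and put back unchanged, or a run of spaces, which is replaced with nothing. The helper name strip_unquoted_spaces is made up for illustration, and the sketch assumes the quotes are balanced and unescaped.

```perl
use strict;
use warnings;

# Hypothetical helper: remove spaces that are not inside "..." pairs.
# Assumes balanced, unescaped double quotes.
sub strip_unquoted_spaces {
    my ($str) = @_;
    # Alternative 1 captures a whole quoted string and puts it back as-is.
    # Alternative 2 matches a run of spaces; $1 is undef in that case, so
    # the replacement is the empty string.
    $str =~ s/("[^"]*")| +/defined $1 ? $1 : ''/ge;
    return $str;
}

print strip_unquoted_spaces('some string with "text in quotes"'), "\n";
print strip_unquoted_spaces('Some.String = "Some Value"'), "\n";
```

Trying the quoted alternative first is what protects the spaces inside the quotes: once the regex engine has consumed a whole quoted string, the space alternative never gets a chance to look inside it.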

Upvotes: 0

choroba

Reputation: 241958

Splitting on double quotes, then removing spaces only from the even-indexed fields (counting from 0, i.e. those outside the quotes):

sub remove_spaces {
    my $string = shift;
    my @fields = split /"/, $string . ' '; # trailing space needed to keep final " in output
    my $flag = 1;
    return join '"', map { s/ +//g if $flag; $flag = ! $flag; $_} @fields;
}
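The trailing space is needed because split drops trailing empty fields by default. A possible variation, sketched here under the same assumption of balanced, unescaped quotes, passes a negative limit to split so that those fields are kept. The name remove_spaces_keep_trailing is made up for illustration, and the index parity stands in for the $flag toggle.

```perl
use strict;
use warnings;

# Hypothetical variant of remove_spaces that avoids appending a space.
sub remove_spaces_keep_trailing {
    my ($string) = @_;
    # A limit of -1 tells split to keep trailing empty fields, so a string
    # that ends in a quote still yields a final (empty) field.
    my @fields = split /"/, $string, -1;
    for my $i (0 .. $#fields) {
        # Even indexes (0, 2, ...) hold the text outside the quotes.
        $fields[$i] =~ s/ +//g if $i % 2 == 0;
    }
    return join '"', @fields;
}

print remove_spaces_keep_trailing('Some.String = "Some Value"'), "\n";
```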

Upvotes: 0

Sinan Ünür

Reputation: 118148

Text::ParseWords is tailor-made for this:

#!/usr/bin/env perl

use strict;
use warnings;
use Text::ParseWords;

my @strings = (
    q{This.string = "Hello World"},
    q{That " string " and "another   shoutout to my   bytes"},
);

for my $s ( @strings ) {
    my @words = quotewords '\s+', 1, $s;
    print join('', @words), "\n";
}

Output:

This.string="Hello World"
That" string "and"another   shoutout to my   bytes"

Using Text::ParseWords means if you ever had to deal with quoted strings with escaped quotation marks in them, you'd be ready ;-)

Also, this sounds like you have a configuration file of some sort and you're trying to parse it. If that is the case, there are probably better solutions.
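To illustrate the remark about escaped quotation marks, here is a small sketch that feeds a value containing backslash-escaped quotes through the same quotewords call. The wrapper name squeeze_outside_quotes and the sample string are made up for illustration.

```perl
use strict;
use warnings;
use Text::ParseWords;

# Hypothetical wrapper around the quotewords call used above.
sub squeeze_outside_quotes {
    my ($s) = @_;
    # A true "keep" argument leaves the quotes and backslashes in the tokens.
    my @words = quotewords('\s+', 1, $s);
    return join '', @words;
}

# Inside q{...} the sequence \" stays as a literal backslash plus quote.
print squeeze_outside_quotes(q{Some.String = "say \"hi\" now"}), "\n";
```

One thing to keep in mind is that Text::ParseWords also treats single quotes as quoting characters, so a stray apostrophe in the input can make the parse fail.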

Upvotes: 3

Borodin

Reputation: 126742

I suggest removing the quoted substrings using split and then recombining them with join after removing whitespace from the intermediate text.

Note that if the regex used for split contains captures then the captured values will also be included in the list returned.

Here's some sample code.

use strict;
use warnings;

my $source = <<END;
Some.String = "Some Value";
Other.String = "Other Value";
Last.String = "Last Value";
END

print join '', map { s/\s+//g unless /"/; $_ } split /("[^"]*")/, $source;

output (\s also matches the newlines, so the three lines run together)

Some.String="Some Value";Other.String="Other Value";Last.String="Last Value";
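The note above about captures in the split pattern can be seen in isolation with a short sketch. The sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;

my $text = 'key = "a b" ;';

# Without a capture group the quoted part acts as a separator and is lost.
my @without = split /"[^"]*"/, $text;

# With a capture group the matched separator is returned between the fields.
my @with = split /("[^"]*")/, $text;

print scalar(@without), " fields: ", join('|', @without), "\n";
print scalar(@with),    " fields: ", join('|', @with),    "\n";
```

With the capture group, the quoted text comes back as its own list element, which is what lets the map skip it while stripping whitespace from its neighbours.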

Upvotes: 1

KIC

Reputation: 6121

I would simply loop through the string character by character. This way you can handle escaped quotes too (just add an isEscaped variable).

my $text='lala "some thing with quotes " lala ... ';
my $quoteOpen = 0;
my $out = '';

foreach my $char (split //, $text) {
  if ($char eq "\"" && $quoteOpen==0) {
    $quoteOpen = 1;
    $out .= $char;
  } elsif ($char eq "\"" && $quoteOpen==1) {
    $quoteOpen = 0;
    $out .= $char;
  } elsif ($char =~ /\s/ && $quoteOpen==1) {
    $out .= $char;
  } elsif ($char !~ /\s/) {
    $out .= $char;
  }
}

print "$out\n";
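One way the suggested escape flag might look is sketched below. It assumes that a backslash only has special meaning inside quotes and that it escapes exactly the next character. The helper name strip_outside_quotes and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: drop whitespace outside double quotes, treating a
# backslash inside quotes as escaping the next character.
sub strip_outside_quotes {
    my ($text) = @_;
    my $in_quotes  = 0;   # currently between an opening and a closing quote?
    my $is_escaped = 0;   # was the previous character an escaping backslash?
    my $out = '';

    for my $char (split //, $text) {
        if ($is_escaped) {
            # The character after a backslash is copied verbatim.
            $is_escaped = 0;
            $out .= $char;
        }
        elsif ($in_quotes && $char eq '\\') {
            $is_escaped = 1;
            $out .= $char;
        }
        elsif ($char eq '"') {
            $in_quotes = !$in_quotes;
            $out .= $char;
        }
        elsif ($in_quotes || $char !~ /\s/) {
            # Keep everything inside quotes; outside, keep only non-whitespace.
            $out .= $char;
        }
    }
    return $out;
}

print strip_outside_quotes('lala "some \" thing " lala ... '), "\n";
```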

Upvotes: 0

TLP

Reputation: 67900

Here is a technique using split to separate the quoted strings. It relies on your data being consistent and will not work with loose quotes.

use strict;
use warnings;

my @line = split /("[^"]*")/;
for (@line) {
    unless (/^"/) {
        s/[ \t]+//g;
    }
}
print @line;  # @line was modified in place by the loop

Basically, you split up the string in order to isolate the quoted strings. Once that is done, perform the substitution on all other strings. Since the array elements are aliased in the loop, substitutions are performed on the actual array.

You can run this script like so:

perl -n script.pl inputfile

to see the output, or

perl -n -i.bak script.pl inputfile

to edit inputfile in place while saving a backup in inputfile.bak.

With that said, I'm not sure what your edit means. Do you want to change

Some.String = "Some Value"

to

Some.String="Some Value"
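The aliasing that the loop above relies on can be checked with a small self-contained sketch. The @parts list is made up and stands in for what the split would return.

```perl
use strict;
use warnings;

# Stand-in for the list produced by split /("[^"]*")/.
my @parts = ('Some.String = ', '"Some Value"', ' ; ');

# $_ is an alias for each element, so s/// changes @parts itself.
for (@parts) {
    s/[ \t]+//g unless /^"/;
}

print join('', @parts), "\n";
```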

Upvotes: 5
