kyy

Reputation: 321

Perl regex to substitute a pattern excluding another pattern

I have a string as below.

$line = 'this is my string "hello world"';

I want a regex that deletes all space characters in the string except those inside the quoted region "hello world".

I use the substitution below to delete spaces, but it deletes all of them.

$line=~s/ +//g;

How can I exclude the quoted region "hello world" so that I get the string below?

thisismystring"hello world"

Thanks

Upvotes: 1

Views: 147

Answers (5)

Ωmega

Reputation: 43673

s/\s+(?=(?:[^"]*"[^"]*")*[^"]*$)//g


Upvotes: 0

perreal

Reputation: 97938

#!/usr/bin/perl
use warnings;
use strict;

sub main {
  my $line = 'this is my string "hello world"';
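  # Print each quoted region or single non-space character; whitespace between them is skipped.
  # (Matching \S instead of \w* keeps punctuation outside the quotes.)
  while ($line =~ /("[^"]*"|\S)\s*/g) { print $1; }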
  print "\n";
}

main;

Upvotes: 0

fardjad

Reputation: 20394

Another regex to do it:

s/(\s+(".*?")?)/$2/g

Upvotes: 0

raina77ow

Reputation: 106385

Well, here's one way to do it:

use warnings;
use strict;

my $l = 'this is my string "hello world some" one two three "some hello word"';
$l =~ s/ +(?=[^"]*(?:"[^"]*"[^"]*)*$)//g;  # * rather than +, so spaces after the last quoted region are removed too

print $l;
# thisismystring"hello world some"onetwothree"some hello word"


But I wonder whether it should be done another way (by tokenizing the string, for example), especially if the quotes may be unbalanced.

Upvotes: 1

Stefan Majewsky

Reputation: 5555

Since you probably want to handle quoted strings properly, you should have a look at the Text::Balanced module.

Use that to split your text into quoted parts and non-quoted parts, then do the replacement on the non-quoted parts only, and finally join the string together again.
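
The split, replace, and join steps can be sketched roughly as follows. Note that this sketch does not use Text::Balanced; it swaps in core Perl's split with a capturing group, and it assumes double-quoted regions with no escaped quotes inside them. The helper name strip_unquoted_spaces is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: delete spaces everywhere except inside "..." regions.
sub strip_unquoted_spaces {
    my ($text) = @_;

    # The capturing group makes split return the quoted parts as well,
    # so even indexes hold unquoted text and odd indexes hold quoted text.
    my @parts = split /("[^"]*")/, $text;

    for my $i (0 .. $#parts) {
        # Only touch the unquoted parts.
        $parts[$i] =~ s/ +//g if $i % 2 == 0;
    }

    return join '', @parts;
}

print strip_unquoted_spaces('this is my string "hello world"'), "\n";
# prints: thisismystring"hello world"
```

An unbalanced trailing quote is simply treated as unquoted text here, so spaces after it are removed. If escaped quotes or nested delimiters matter, a real parser such as Text::Balanced is likely the better choice.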

Upvotes: 4
