user3101852

Reputation: 25

How to replace special characters with an underscore (_) in Perl

my @folder = ('s,c%','c__pp_p','Monday_øå_Tuesday,  Wednesday','Monday &       Tuesday','Monday_Tuesday___Wednesday');

for my $folder (@folder) {
  if ($folder =~ s/[^\w_*\-]/_/g) {
    $folder =~ s/_+/_/g;
    print "$folder : Got %\n";
  }
}

Using the code above, I am not able to handle this case: "Monday_øå_Tuesday_Wednesday"

The output should be :

s_c
c_pp_p
Monday_øå_Tuesday_Wednesday
Monday_Tuesday
Monday_Tuesday_Wednesday

Upvotes: 1

Views: 513

Answers (1)

Sobrique

Reputation: 53478

You can use \W to negate the \w character class, but the problem you've got is that \w isn't matching your non-ASCII letters here, so ø and å get replaced along with the punctuation.

So you need to do something like this instead:

#!/usr/bin/env perl
use strict;
use warnings;
use Data::Dumper;

my @folder = ('s,c%','c__pp_p','Monday_øå_Tuesday,  Wednesday','Monday &       Tuesday','Monday_Tuesday___Wednesday');

s/[^\p{Alpha}]+/_/g for @folder;
print Dumper \@folder;

Outputs:

$VAR1 = [
          's_c_',
          'c_pp_p',
          'Monday_øå_Tuesday_Wednesday',
          'Monday_Tuesday',
          'Monday_Tuesday_Wednesday'
        ];

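This uses a Unicode property (these are documented in perldoc perluniprop). The long and short of it is that \p{Alpha} is the Unicode alphabetic set, so much like \w but internationalised. Unlike \w, it does not include digits or the underscore, so those get replaced as well; if you need to keep digits, \p{Alnum} covers letters plus digits.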
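As a small illustration of how these classes differ, here is a sketch that runs the same substitution with \p{Alpha}, \p{Alnum} and \W. The sample string and variable names are made up for the example and are not from the question.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical sample name containing letters, digits, an underscore and a space.
my $s = 'Report_2024 final';

# Letters only: digits, underscore and space all count as "special".
(my $alpha = $s) =~ s/[^\p{Alpha}]+/_/g;

# Letters and digits: only the underscore and the space are replaced.
(my $alnum = $s) =~ s/[^\p{Alnum}]+/_/g;

# \W is the negation of \w, which already includes digits and underscore.
(my $word = $s) =~ s/\W+/_/g;

print "$alpha\n";   # Report_final
print "$alnum\n";   # Report_2024_final
print "$word\n";    # Report_2024_final
```

Which class is right depends on whether digits in folder names should survive; the data in the question has none, so \p{Alpha} is enough there.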

Note that this leaves a trailing _ on the first entry (s_c_), whereas your expected output shows s_c. If you want to drop it, it's probably easier to:

s/_$// for @folder;

than make a more complicated pattern.
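Putting the two substitutions together, a minimal sketch might look like the following. It assumes the script file is saved as UTF-8 (hence use utf8) and that the terminal expects UTF-8 output; the helper name sanitize is hypothetical, and it prints with plain print rather than Data::Dumper.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;                              # assumption: this file is saved as UTF-8
binmode STDOUT, ':encoding(UTF-8)';    # assumption: the terminal expects UTF-8

# Hypothetical helper: replace every run of non-letters with a single
# underscore, then strip any underscore left at the start or end.
sub sanitize {
    my ($name) = @_;
    $name =~ s/[^\p{Alpha}]+/_/g;
    $name =~ s/^_+|_+$//g;
    return $name;
}

my @folder = ('s,c%', 'c__pp_p', 'Monday_øå_Tuesday,  Wednesday',
              'Monday &       Tuesday', 'Monday_Tuesday___Wednesday');

print sanitize($_), "\n" for @folder;
```

With those assumptions, this should print the five lines listed as the expected output in the question. Working on a copy inside the sub leaves @folder itself untouched, which is a design choice and not something the question requires.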

Upvotes: 2
