bart2puck

Reputation: 2522

perl search and replace a substring

I am trying to search for a substring and replace the whole string if the substring is found. In the example below, someVal could be any value and is unknown to me.

How can I search for someServer.com and replace the whole string $oldUrl with $newUrl?

I can do it on the whole string just fine:

$directory = "/var/tftpboot";

my $oldUrl = "someVal.someServer.com";
my $newUrl = "someNewVal.someNewServer.com";

opendir( DIR, $directory ) or die $!;
while ( my $files = readdir(DIR) ) {
    next unless ( $files =~ m/\.cfg$/ );
    open my $in,  "<", "$directory/$files";
    open my $out, ">", "$directory/temp.txt";
    while (<$in>) {
        s/.*$oldUrl.*/$newUrl/;
        print $out $_;
    }
    rename "$directory/temp.txt", "$directory/$files";
}

Upvotes: 1

Views: 2089

Answers (3)

Miller

Reputation: 35198

If you want to match and replace any subdomain, then you should devise a specific regular expression to match them.

\b(?i:(?!-)[a-z0-9-]+\.)*someServer\.com
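To see what that pattern does, here is a minimal sketch that applies it to a few lines. The sample lines and the helper name replace_host are made up for illustration, and the trailing \b is the one used in the full script rather than in the bare pattern above.

```perl
use strict;
use warnings;

# Pattern from the answer, with a trailing \b added as in the full script.
my $host_re = qr/\b(?i:(?!-)[a-z0-9-]+\.)*someServer\.com\b/;

# Hypothetical helper: returns a copy of the line with the host replaced.
sub replace_host {
    my ($line) = @_;
    (my $copy = $line) =~ s/$host_re/someNewVal.someNewServer.com/;
    return $copy;
}

# Hypothetical sample lines; real .cfg contents are not shown in the question.
print replace_host("server = someVal.someServer.com"), "\n";
print replace_host("server = a.b.someServer.com:69"),  "\n";
print replace_host("server = other.example.org"),      "\n";
```

The first two lines should come out with the whole host name, including any number of subdomain labels, replaced by the new one, and the third line should be left alone. Note that only the subdomain part is case-insensitive; someServer.com itself must match exactly.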

The following is a rewrite of your script using more modern Perl techniques, including Path::Class to handle file and directory operations in a cross-platform way and $INPLACE_EDIT ($^I) to handle the in-place editing of each file automatically.

use strict;
use warnings;
use autodie;

use Path::Class;

my $dir = dir("/var/tftpboot");

while (my $file = $dir->next) {
    next unless $file =~ m/\.cfg$/;

    local @ARGV = "$file";
    local $^I = '.bak';
    while (<>) {
        s/\b(?i:(?!-)[a-z0-9-]+\.)*someServer\.com\b/someNewVal.someNewServer.com/;
        print;
    }
    #unlink "$file$^I"; # Optionally delete backup
}

Upvotes: 1

TLP

Reputation: 67900

Your script will delete much of your content because you are surrounding the match with .*. This will match any character except newline, as many times as it can, from start to end of each line, and replace it.
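A small sketch of the difference, assuming a made-up config line (the real file contents are not shown in the question). The second substitution drops the .* and quotes the old name with \Q ... \E, which is explained further down.

```perl
use strict;
use warnings;

my $old = "someVal.someServer.com";
my $new = "someNewVal.someNewServer.com";

# Hypothetical config line, for illustration only.
my $line = "tftp_server = someVal.someServer.com ; primary\n";

(my $greedy = $line) =~ s/.*$old.*/$new/;    # pattern from the question
(my $narrow = $line) =~ s/\Q$old\E/$new/;    # replaces only the host name

print "greedy: $greedy";
print "narrow: $narrow";
```

With the pattern from the question, only the new URL and the newline survive; the key name and the trailing comment are gone. The narrower pattern keeps the rest of the line intact.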

The functionality you are after already exists in Perl as the -p and -i command-line switches, so it is a good idea to use them rather than writing your own loop that does the same thing. You do not need a one-liner to use in-place editing; you can put the code in a script file and run it like this:

perl -pi script.pl *.cfg

The script should contain the name definitions and substitutions, and any error checking you need.

my $old = "someVal.someServer.com";
my $new = "someNewVal.someNewServer.com";

s/\Q$old\E/$new/g;

This is the simplest possible solution, when running with the -pi switches, as I showed above. The \Q ... \E is the quotemeta escape, which escapes meta characters in your string (highly recommended).
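To illustrate why the quoting matters, here is a minimal sketch. The look-alike string is invented for the demonstration; it has other characters where $old has dots.

```perl
use strict;
use warnings;

my $old = "someVal.someServer.com";

# Hypothetical string in which the dots of $old are replaced by X and Y.
my $lookalike = "someValXsomeServerYcom";

my $unescaped = $lookalike =~ /$old/     ? 1 : 0;   # "." matches any character
my $escaped   = $lookalike =~ /\Q$old\E/ ? 1 : 0;   # "." must be a literal dot

print "unescaped matches: $unescaped\n";
print "escaped matches:   $escaped\n";
print quotemeta($old), "\n";    # the string \Q ... \E effectively inserts
```

Without the quoting, the look-alike matches because each dot is a wildcard; with it, the match fails as intended.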

You might want to prevent partial matches. If you are matching foo.bar, you may not want to match foo.bar.baz, or snafoo.bar. To prevent partial matching, you can put in anchors of different kinds.

  • (?<!\S) -- do not allow any non-whitespace before match
  • \b -- match word boundary

A word boundary is suitable if you want to replace the foo.bar inside server1.foo.bar but not the one inside snafoo.bar. Otherwise, use the whitespace boundary. The whitespace boundary is written as a double negation (a negative lookaround assertion around the negated character class \S) so that it also matches at the beginning and end of a line.
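The two kinds of anchor can be compared side by side with a short sketch. The helper names matches_ws and matches_word are made up here, and the sample strings are the ones from the example above.

```perl
use strict;
use warnings;

my $old = "foo.bar";

# Candidate strings from the example above.
my @samples = ("foo.bar", "foo.bar.baz", "snafoo.bar", "server1.foo.bar");

# Hypothetical helpers: 1 if the anchored pattern matches, 0 otherwise.
sub matches_ws   { $_[0] =~ /(?<!\S)\Q$old\E(?!\S)/ ? 1 : 0 }   # whitespace boundary
sub matches_word { $_[0] =~ /\b\Q$old\E\b/          ? 1 : 0 }   # word boundary

printf "%-16s ws=%d word=%d\n", $_, matches_ws($_), matches_word($_)
    for @samples;
```

The whitespace boundary accepts only the bare foo.bar. The word boundary also accepts foo.bar.baz and server1.foo.bar, because a dot next to a letter counts as a word boundary, but it still rejects snafoo.bar.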

So, to sum up, I would do:

use strict;
use warnings;

my $old = "someVal.someServer.com";
my $new = "someNewVal.someNewServer.com";

s/(?<!\S)\Q$old\E(?!\S)/$new/g;

And run it with

perl -pi script.pl *.cfg

If you want to try it out beforehand (highly recommended!), just remove the -i switch, which will make the script print to standard output (your terminal) instead. You can then run a diff on the files to inspect the difference. E.g.:

$ perl -p script.pl test.cfg > test_replaced.cfg
$ diff test.cfg test_replaced.cfg

You will have to decide whether a word boundary suits your data better; if it does, replace the lookaround assertions with \b.

Always use

use strict;
use warnings;

Even in small scripts like this. It will save you time and headaches.

Upvotes: 2

choroba

Reputation: 241858

Watch for the Dot-Star: it matches everything that surrounds the old URL, so the only thing remaining on the line will be the new URL:

s/.*$oldUrl.*/$newUrl/; 

Better:

s/$oldUrl/$newUrl/;

Also, you might need to close the output file before you try to rename it.

If the old URL contains special characters (dots, asterisks, dollar signs...) you might need to use \Q$oldUrl to suppress their special meaning in the regex pattern.
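Putting those three points together, the inner part of the loop might look roughly like this. It is only a sketch: replace_in_file and the .tmp suffix are hypothetical names chosen for the example, not part of the original script.

```perl
use strict;
use warnings;

# Sketch of the question's per-file work with the suggestions applied.
# replace_in_file is a hypothetical helper name.
sub replace_in_file {
    my ($path, $old, $new) = @_;
    my $tmp = "$path.tmp";    # hypothetical temp-file name

    open my $in,  "<", $path or die "Cannot read $path: $!";
    open my $out, ">", $tmp  or die "Cannot write $tmp: $!";
    while (my $line = <$in>) {
        $line =~ s/\Q$old\E/$new/;    # no dot-star, metacharacters quoted
        print {$out} $line;
    }
    close $in;
    close $out or die "Cannot close $tmp: $!";    # flush before renaming
    rename $tmp, $path or die "Cannot rename $tmp to $path: $!";
    return;
}
```

It would be called once per .cfg file, for example replace_in_file("$directory/$files", $oldUrl, $newUrl) inside the readdir loop from the question.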

Upvotes: 0
