preahkumpii

Reputation: 1301

perl search & replace script for all files in a directory

I have a directory with nearly 1,200 files. I need a Perl script that goes through each file in turn and searches for and replaces any occurrences of 66 strings, so every file gets all 66 s&r's. My replacement strings are in Thai, so I cannot use the shell. It must be a .pl file or similar so that I can use the utf8 pragma (use utf8;). I am just not familiar with how to open all the files in a directory one by one to perform actions on them. Here is a sample of my s&r:

s/psa0*(\d+)/เพลงสดุดี\1/g;

Thanks for any help.

Upvotes: 2

Views: 2896

Answers (3)

preahkumpii

Reputation: 1301

Just in case someone can use it in the future, this is what I actually did.

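use warnings;
use strict;

use utf8;

# Process every .html file in the current directory.
my @files = glob("*.html");

foreach my $file (@files) {
   # Read the original as UTF-8; write the edited copy to "<name>-".
   open my $in,  '<:encoding(UTF-8)', $file    or die "Cannot read $file: $!";
   open my $out, '>:encoding(UTF-8)', "$file-" or die "Cannot write $file-: $!";
   while (<$in>) {
      # /g replaces every occurrence on the line, not only the first.
      s/gen0*(\d+)/ปฐมกาล $1/g;
      s/exo0*(\d+)/อพยพ $1/g;
      s/lev0*(\d+)/เลวีนิติ $1/g;
      s/num0*(\d+)/กันดารวิถี $1/g;
      # ...etc...
      print {$out} $_;
   }
   close $in;
   close $out or die "Cannot close $file-: $!";
}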
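
With 66 rules, it may be easier to keep the abbreviations and book names in a list and loop over them instead of writing 66 separate s/// lines. The following is only a sketch of that idea, not what I ran: the @books list and the replace_refs helper are made-up names, and only the four books shown above are filled in.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical lookup list: abbreviation => Thai book name.
# Only the four pairs from the script above are included here.
my @books = (
    [ gen => 'ปฐมกาล' ],
    [ exo => 'อพยพ' ],
    [ lev => 'เลวีนิติ' ],
    [ num => 'กันดารวิถี' ],
);

# Apply every rule to one string and return the edited copy.
sub replace_refs {
    my ($text) = @_;
    for my $pair (@books) {
        my ( $abbr, $name ) = @$pair;
        # \Q...\E quotes the abbreviation in case it contains regex metacharacters.
        $text =~ s/\Q$abbr\E0*(\d+)/$name $1/g;
    }
    return $text;
}

my $sample = replace_refs('gen001 and exo12');

binmode STDOUT, ':encoding(UTF-8)';
print "$sample\n";
```

Inside the per-file loop, the block of s/// lines would then become a single call such as $_ = replace_refs($_);.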

Upvotes: 1

ikegami

Reputation: 385556

use utf8;
use strict;
use warnings;

use File::Glob qw( bsd_glob );

@ARGV = map bsd_glob($_), @ARGV;

while (<>) {
   s/psa0*(?=\d)/เพลงสดุดี/g;
   print;
}

perl -i.bak script.pl *

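I used File::Glob's bsd_glob because glob splits its argument on whitespace, so a pattern containing a space is treated as several separate patterns, while bsd_glob treats the whole string as a single pattern. They are actually the same function underneath, but it behaves differently depending on how it's called.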
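
To make the difference concrete, here is a small sketch that passes the same string to both functions. The file name is hypothetical and does not need to exist, since with the default flags a pattern without wildcard characters is returned unchanged.

```perl
use strict;
use warnings;
use File::Glob qw( bsd_glob );

# Hypothetical file name containing a space.
my $pattern = 'my file.html';

# Core glob splits the argument on whitespace into two patterns.
my @from_glob = glob($pattern);

# bsd_glob treats the whole string as one pattern.
my @from_bsd_glob = bsd_glob($pattern);

printf "glob:     %d item(s): %s\n", scalar @from_glob,     join ' | ', @from_glob;
printf "bsd_glob: %d item(s): %s\n", scalar @from_bsd_glob, join ' | ', @from_bsd_glob;
```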


By the way, using \1 in the replacement expression (i.e. outside a regular expression) makes no sense. \1 is a regex pattern that means "match what the first capture captured". So

s/psa0*(\d+)/เพลงสดุดี\1/g;

should be

s/psa0*(\d+)/เพลงสดุดี$1/g;

The following is a faster alternative:

s/psa0*(?=\d)/เพลงสดุดี/g;
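
As a quick check that the two forms give the same result, the sketch below runs both on a made-up sample string. The lookahead (?=\d) only asserts that a digit follows, so the digits stay in place and do not have to be captured and reinserted.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical sample input.
my $sample = 'psa001 psa023 psa119';

# Capture the digits and put them back with $1.
( my $with_capture = $sample ) =~ s/psa0*(\d+)/เพลงสดุดี$1/g;

# Match only the prefix and leading zeros; the digits are left untouched.
( my $with_lookahead = $sample ) =~ s/psa0*(?=\d)/เพลงสดุดี/g;

binmode STDOUT, ':encoding(UTF-8)';
print "$with_capture\n";
print "$with_lookahead\n";
print $with_capture eq $with_lookahead ? "same\n" : "different\n";
```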

Upvotes: 2

mob

Reputation: 118595

See opendir/readdir/closedir for functions that can iterate through all the filenames in a directory (much like you would use open/readline/close to iterate through all the lines in a file).

Also see the glob function, which returns a list of filenames that match some pattern.
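
A minimal sketch of both approaches, assuming you only want the .html files. It builds a scratch directory with File::Temp so it can run anywhere; the file names are made up, and the scratch path is assumed to contain no spaces (otherwise glob would split it). Note that readdir returns bare names, while glob returns the names with the directory part you gave it.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

# Scratch directory with a few hypothetical files.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw( a.html b.html notes.txt )) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
}

# opendir/readdir/closedir: returns every entry (including . and ..),
# so filter for plain files ending in .html.
opendir my $dh, $dir or die "Cannot open $dir: $!";
my @from_readdir = sort grep { /\.html\z/ && -f "$dir/$_" } readdir $dh;
closedir $dh;

# glob: the pattern does the filtering and the paths include $dir.
my @from_glob = sort glob("$dir/*.html");

print "readdir: @from_readdir\n";
print "glob:    @from_glob\n";
```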

Upvotes: 1
