fearless_fool

Reputation: 35159

Substitute multiple strings from a file over multiple files

Let's say I'm George Orwell and I want to replace all instances of "big" with "small", "rich" with "poor", "smart" with "stupid", etc. across a bunch of files. So I create a text file with one line per substitution:

file: substs.csv

big, small
rich, poor
smart, stupid

Now I want to apply the substitutions in substs.csv globally across a bunch of files. I assume this calls for a sed script. Note that I'm happy to give substs.csv any format, as long as it's one substitution pair per line.
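For concreteness (made-up input), a file containing the line

the big, rich, smart fox

should end up after the substitutions as

the small, poor, stupid fox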

What's the right tool, and what's the script that will do this?

Edit 1: It's fine to operate on just one file at a time. I can do foreach or equivalent...

Edit 2: I can guarantee that substitutions on the right hand side don't appear on the left hand side, i.e., order of operation won't matter.

[I'm tempted to just bust out python and do it there. But this is a chance to refresh my unix tools chops...]

Upvotes: 0

Views: 199

Answers (2)

potong

Reputation: 58371

This might work for you (GNU sed):

sed -En 's#(\S+), (\S+)#s/\\<\1\\>/\2/g#p' csvFile | sed -f - txtFile

Convert the csv file into a sed file and apply it to the text file.

N.B. The start/end word boundaries (\< and \>) in the manufactured regexps restrict each substitution to whole words.
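For the substs.csv shown in the question, the first sed would emit a script along these lines (a sketch of the intermediate output), which the second sed then reads from standard input via -f -:

s/\<big\>/small/g
s/\<rich\>/poor/g
s/\<smart\>/stupid/g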

Upvotes: 0

tink

Reputation: 15206

As Kamil said in the comments, there's probably a million different ways to stroke that cat ...

One that sprang into my warped mind was:

find . -type f -name "*.txt" -exec $(awk -F", *" 'BEGIN{printf "sed -i.bk "}{printf "-e s/%s/%s/g ", $1,$2}END{printf "\n"}' substs.csv) {} \;

Basically I'm building the sed command on the fly (using your substs.csv and awk) and then using it via find to modify any files that end in .txt. Your selection criteria may be wider, and you may not want backups of the files (get rid of the .bk in "sed -i.bk ") ... but it does what you're trying to achieve.
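As a sketch, with the substs.csv from the question the command substitution expands to roughly the following, where path/to/file.txt stands in for whatever path find supplies via {}:

sed -i.bk -e s/big/small/g -e s/rich/poor/g -e s/smart/stupid/g path/to/file.txt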

Upvotes: 1
