C.J.

Reputation: 16081

Search and replace hundreds of strings in tens of thousands of files?

I am looking into changing the file names of hundreds of files in a (C/C++) project that I work on. The problem is that our software has tens of thousands of files that include (i.e. #include) these hundreds of files that will get changed. This looks like a maintenance nightmare: if I do this by hand I will be stuck in UltraEdit for weeks, rolling hundreds of regexes like so:

^\#include.*["<\\/]stupid_name.*$

with

#include <dir/new_name.h>

Such drudgery would be worse than peeling hundreds of potatoes in a sunken submarine in the Antarctic with a spoon. It would be far better to put the inputs and outputs into a table like so:

stupid_name.h <-> <dir/new_name.h>
stupid_nameb.h <-> <dir/new_nameb.h>
stupid_namec.h <-> <dir/new_namec.h>

and feed this into a regular expression engine / tool / app / etc...

My Ultimate Question: Is there a tool that will do that?

Bonus Question: Is it multi-threaded?

I looked at quite a few search and replace topics here on this website, and found lots of standard queries that asked a variant of the following question:

standard question: Replace one term in N files.

as opposed to:

my question: Replace N terms in N files.

Thanks in advance for any replies.

Upvotes: 5

Views: 993

Answers (7)

ghostdog74

Reputation: 342303

In *nix (or with GNU Win32 tools), you can use GNU find and sed together, e.g.:

find /path -type f -name "*.c" -exec sed -i.bak 's|^#include.*["<\\/]stupid_name.*$|#include <dir/new_name.h>|' "{}" +

Explanation: find starts at /path and looks for regular files (-type f); -name "*.c" matches every .c file; for each file found, sed rewrites the matching include line to the new string. The -i.bak option tells sed to save the original file with a .bak extension before editing in place, and "{}" stands for the file name that find passes to sed.
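With hundreds of names to change, you could generate the sed script from a table instead of editing the command each time. A minimal sketch, assuming a whitespace-separated two-column file renames.txt (old name, then new include path; the file name and layout are assumptions); note that dots in the old names act as regex wildcards here, which is harmless for this job:

# build one substitution per table row, then apply them all in one pass
while read old new; do
  printf 's|^#include.*%s.*$|#include <%s>|\n' "$old" "$new"
done < renames.txt > renames.sed

find /path -type f -name "*.c" -exec sed -i.bak -f renames.sed "{}" +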

Upvotes: 0

Tim Pietzcker

Reputation: 336108

PowerGREP can do that. It can search for multiple search strings (literal text or regular expressions) in any combination of files, and is multithreaded (starting with PowerGREP 4, the current version).

Screenshot (PowerGREP search setup): http://img682.imageshack.us/img682/5172/screen006c.png

You can save your searches for later re-use, too.

Upvotes: 0

Beta

Reputation: 99094

As Mark Wilkins says, this is a workable plan with whatever regex-handy scripting tool you prefer, but I'd suggest a few additional points:

  1. Use two scripts: one to translate your list into regexes, and another to apply them. Trying to do both jobs in one script is asking for trouble.
  2. Don't forget to change the #include directives and rename the header files at the same time.
  3. If you know how to change one thing in N files, then, heck, you can just loop over the K things you want to change. It's not the most efficient way in terms of processor time, but that's not the bottleneck here.
  4. This approach will work in theory, but if it works in practice on the first try then your code base is cleaner than anything (that size) I've ever seen. There will almost certainly be little surprises: a hard-coded path that doesn't match the regex, a bad name that collides with a good name, some other glitch nobody would have thought of. I suggest starting small, with one or two pairs of names, compiling after every replacement, and retreating in case of trouble (a rough loop for this is sketched below). If you do this right you can set it up to run overnight and in the morning you'll have a working code base that's almost done, and a list of the names that caused trouble and need human attention.
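A rough sketch of that compile-and-retreat loop. The details here are all assumptions, not part of the plan above: a whitespace-separated renames.txt table of old/new names, headers living under include/, a working make target, GNU sed, and git for the retreat step.

while read old new; do
  # rewrite the #include lines for this one rename (GNU sed in-place)
  find . -name '*.[ch]' -exec sed -i "s|^#include.*$old.*\$|#include <$new>|" "{}" +
  # rename the header itself; mkdir in case the new path adds a directory
  mkdir -p "include/$(dirname "$new")"
  git mv "include/$old" "include/$new"
  if make > /dev/null 2>&1; then
    git commit -qam "rename $old -> $new"   # keep the good state
  else
    git reset --hard -q                     # retreat to the last good commit
    echo "needs human attention: $old" >> trouble.txt
  fi
done < renames.txt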

Upvotes: 1

SDGator

Reputation: 2077

Make a series of perl one-liners to edit the files in place, like so:

perl -i.bak -p -e 's/stupid_old_name/cool_new_name/' *.c

This has the added bonus of saving the originals of any changed files with a .bak extension.

I'd make a bunch of these, if I didn't know perl that well. I'd even put all the one-liners into a shell script, but then I'm not trying to impress any of the unix graybeards out there.

This page explains in-place editing with perl very well: http://www.rice.edu/web/perl-edit.html

PS - Since I do know perl fairly well, I'd just write the was/is table in a "real" perl script and use it to open and parse all the files.
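A sketch of what that could look like, assuming a whitespace-separated was/is table in renames.txt (the file name and its layout are assumptions): the table is loaded once in a BEGIN block and applied to every include line.

perl -i.bak -pe '
  BEGIN {                       # load the was/is table once
    open my $t, "<", "renames.txt" or die "renames.txt: $!";
    while (<$t>) { my ($old, $new) = split; $map{$old} = $new }
  }
  # rewrite any #include line that mentions an old name
  for my $old (keys %map) {
    s{^\s*#include.*\Q$old\E.*$}{#include <$map{$old}>};
  }
' *.c *.h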

Upvotes: 1

drawnonward

Reputation: 53659

I would use awk, a command line tool similar to sed.

mv file.x file.x.bak
awk '{
  gsub( "#include \"bad_one.h\"" , "#include \"good_one.h\"" );
  gsub( "#include \"bad_two.h\"" , "#include \"good_two.h\"" );
  print;  # without print, awk would write nothing to file.x
}' file.x.bak > file.x

Once you are at a terminal, use man awk to see more details.
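With many pairs, awk can also read the table itself instead of hard-coding one gsub per rename. A variant sketch, assuming a two-column renames.txt (old name, new path) and that quoted includes become angle-bracket includes; gsub treats the map keys as regexes, so the dots in file names match any character, which is harmless here:

mv file.x file.x.bak
awk 'NR == FNR {                      # first file: load the table
       map["#include \"" $1 "\""] = "#include <" $2 ">"
       next
     }
     {                                # second file: apply every rename
       for (old in map) gsub(old, map[old])
       print
     }' renames.txt file.x.bak > file.x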

Upvotes: 2

Jeremy

Reputation: 4838

Will this (Wingrep) do the trick?

Upvotes: 0

Mark Wilkins

Reputation: 41232

I think your idea of putting the old/new names into a single location is a good one. It would certainly reduce the difficulty of maintaining and verifying the changes. It seems like this is the obvious answer, but I think that using any of the popular scripting languages such as ruby, python, perl, etc. would make this task fairly straightforward. The script could read in the file that has the old/new replacement information, construct the appropriate regular expressions from that, and then process the files that need the replacements.

The script could be written as a multi-threaded utility, although it doesn't seem like there would be a lot of benefit in this type of situation. If I understand the question, this should be basically a one-time usage so high performance does not seem like the top priority.

Upvotes: 1
