Nikolay Yordanov

Reputation: 1404

Search and replace regex over multiple files (large data)

I have the following piece of code that is repeated in several files:

<tr>
    <th scope="row"> (some php code) </th>
    <td>
         (more php and html)
    </td>
</tr>

There may be some whitespace before/after tr, th or td tags.

What tool and regex shall I use in order to replace it with the following:

<div class="row">
    $1
    $2
</div>

Thanks.

Upvotes: 1

Views: 673

Answers (3)

ghostdog74

Reputation: 342303

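You can do this with awk as well. First set the record separator to </tr>, then match the records that contain both the opening <tr> tag and your search string. Let's say your search string is "more html code". Note that a multi-character RS needs gawk or mawk (POSIX awk only uses the first character), and that the output goes to a new file, because redirecting into the file awk is still reading can truncate it.

v="my new string"
awk -v RS="</tr>" -v newstring="$v" '/<tr>/ && /more html code/ { $0 = newstring } { print }' file > file.new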

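Another alternative to Perl, similar to the accepted answer, is Ruby. It takes the same -0777 and -i switches, but the substitution is written with gsub!, and /m is what lets . match newlines:

ruby -0777 -i.orig -pe '$_.gsub!(/foo/m, "bar")' file1 file2 file3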

Upvotes: 1

Matt Ball

Reputation: 359776

For the ∞th time, do not use regex to parse HTML. Use an HTML parser.

In Perl, that means using a module such as Web::Scraper.

Upvotes: 4

tchrist

Reputation: 80384

Perl has a -0777 command line option that reads each file into memory as a single string. Once you’ve done that, a substitution that uses \s* for whitespace will match across newline boundaries. If you use . in the pattern, add the /s modifier to the substitution so that . matches newlines too.

I can’t really tell what you want to match, but the general principle is:

perl -0777 -i.orig -pe 's/foo/bar/gs' file1 file2 file3

Upvotes: 3
