Florin C.

Reputation: 366

Find & replace multiple keywords defined string

I'm trying to remove the following string/line in my SQL database:

<p><span style="font-size:16px"><strong>The quick brown &nbsp;</strong></span><strong><span style="font-size:16px">fox jumps.</span></strong></p>
  1. String will always start with <p> and end with </p>
  2. String will always contain these words, in the same order: The, quick, brown. But they might be separated by something else (space, &nbsp; or other HTML tags)
  3. String is part of a field with more text and nested HTML tags, so the solution must ignore higher-level <p></p> tags.
  4. We are talking about 20k+ matches, so no manual-edit solutions please :)

I have already tried doing it with RegExp but I can't filter for multiple keywords (AND operator).

I can export my DB to an SQL file, so I can use any solution you recommend (Windows/Linux, text editor, JS script, etc.), but I would appreciate the simplest and most elegant one.

Upvotes: 1

Views: 55

Answers (3)

Wiktor Stribiżew

Reputation: 627087

I think you have to replace .* with the less efficient but more precise (?:(?!<\/?p[^<]*>).)*, which cannot run past an opening or closing p tag and so forces all the words to match inside a single <p> element:

(?i)<p>(?:(?!<\/?p[^<]*>).)*the(?:(?!<\/?p[^<]*>).)*?quick(?:(?!<\/?p[^<]*>).)*?brown(?:(?!<\/?p[^<]*>).)*?<\/p>

See demo
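For anyone who wants to run this from a script rather than an editor, here is a minimal Python sketch of the same pattern. The names NOT_P_TAG, PATTERN, strip_paragraph and the sample string are made up for illustration, and the "Intro"/"Outro" paragraphs are assumed filler, not data from the question.

```python
import re

# Tempered token: one character that does not start a <p ...> or </p> tag.
NOT_P_TAG = r"(?:(?!</?p[^<]*>).)"

# Same structure as the regex above: the three words, in order, inside one <p>...</p>.
PATTERN = re.compile(
    r"<p>" + NOT_P_TAG + r"*the"
    + NOT_P_TAG + r"*?quick"
    + NOT_P_TAG + r"*?brown"
    + NOT_P_TAG + r"*?</p>",
    re.IGNORECASE,  # equivalent of the inline (?i)
)


def strip_paragraph(html: str) -> str:
    """Remove every innermost <p>...</p> that contains the, quick, brown in order."""
    return PATTERN.sub("", html)


# Hypothetical field value: the target paragraph sits between two others.
sample = (
    "<p>Intro text.</p>"
    '<p><span style="font-size:16px"><strong>The quick brown &nbsp;</strong></span>'
    '<strong><span style="font-size:16px">fox jumps.</span></strong></p>'
    "<p>Outro text.</p>"
)

print(strip_paragraph(sample))
```

Because the token refuses to cross any p tag, a match can never start in one paragraph and end in another, which is what plain .* would allow.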

Upvotes: 1

karthik manchala

Reputation: 13640

You can use the following in any editor (say, Notepad++), in JavaScript, or in any PCRE engine with the g, m and i modifiers to match:

^<p>.*?the.*?quick.*?brown.*?<\/p>$

I used .* instead of .+ because of your statement that the words MIGHT be separated by something else.

Then replace the matches with '' (empty string).
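As a rough illustration of the modifiers, this is how the same pattern might look in Python, where re.sub already replaces every match (the g part) and the m and i modifiers become re.MULTILINE and re.IGNORECASE. LINE_PATTERN, blank_matching_lines and the three-line dump are hypothetical names and data.

```python
import re

# ^ and $ match at line boundaries because of re.MULTILINE.
LINE_PATTERN = re.compile(
    r"^<p>.*?the.*?quick.*?brown.*?</p>$",
    re.MULTILINE | re.IGNORECASE,
)


def blank_matching_lines(text: str) -> str:
    """Replace each whole-line match with an empty string."""
    return LINE_PATTERN.sub("", text)


# Hypothetical dump: only the second line is a <p>...</p> that fills the whole line
# and contains the three words in order.
dump = "\n".join([
    "<p>Keep this paragraph.</p>",
    "<p><strong>The quick brown &nbsp;</strong>fox jumps.</p>",
    "<div><p>The quick brown fox</p></div>",
])

result = blank_matching_lines(dump)
print(result)
```

In this sketch the second line is emptied (its line break stays), while the third line is left alone because the anchors require the <p> to start the line and the </p> to end it.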

Upvotes: 0

Yogesh_D

Reputation: 18809

The expression ^<p>.*The.*quick.*brown.*</p>$ worked for me (the \$ in the command only escapes the dollar sign inside the shell's double quotes):

 [root@fedora ~]# grep "^<p>.*The.*quick.*brown.*</p>\$" test1.txt
<p><span style="font-size:16px"><strong>The quick brown &nbsp;</strong></span><strong><span style="font-size:16px">fox jumps.</span></strong></p>
<p><strong>The quick brown &nbsp;</strong></span><strong><span style="font-size:16px">fox jumps.</span></strong></p>
<p>The quick brown &nbsp;</strong></span><strong><span style="font-size:16px">fox jumps.</p>
[root@fedora ~]#
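grep only lists the matching lines. To actually drop them from an exported file, something along these lines could work; it is a sketch, with GREP_PATTERN, split_lines and the sample lines invented for the example. Like the grep call above, it is case-sensitive and works one line at a time.

```python
import re

# Same expression as the grep call, without the shell escaping.
GREP_PATTERN = re.compile(r"^<p>.*The.*quick.*brown.*</p>$")


def split_lines(lines):
    """Return (matched, kept): lines grep would print, and lines grep -v would print."""
    matched = [ln for ln in lines if GREP_PATTERN.search(ln)]
    kept = [ln for ln in lines if not GREP_PATTERN.search(ln)]
    return matched, kept


# Hypothetical input lines.
lines = [
    '<p>The quick brown &nbsp;</strong></span><strong><span style="font-size:16px">fox jumps.</p>',
    "<p>the quick brown fox</p>",  # lowercase "the": not matched without a case-insensitive flag
    "<p>Unrelated paragraph.</p>",
]

matched, kept = split_lines(lines)
print(len(matched), "matched;", len(kept), "kept")
```

Writing kept back to a file gives the cleaned dump. Adding re.IGNORECASE (or grep -i) would also catch the lowercase variant.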

Upvotes: 0
