Reputation: 366
I'm trying to remove the following string/line in my SQL database:
<p><span style="font-size:16px"><strong>The quick brown </strong></span><strong><span style="font-size:16px">fox jumps.</span></strong></p>
I want to remove every line that starts with <p> and ends with </p> and contains the words The, quick, brown. But they might be separated by something else (space, or other HTML tags) inside the <p></p> tags.
I have already tried doing it with RegExp but I can't filter for multiple keywords (AND operator).
I can export my DB to an SQL file, so I can use any solution you would recommend (Windows/Linux, text editor, JS script, etc.), but I would appreciate the simplest and most elegant solution.
Upvotes: 1
Views: 55
Reputation: 627087
I think you have to replace .* with the less efficient but more precise (?:(?!<\/?p[^<]*>).)*, which forces the words to be matched inside a single <p> tag:
(?i)<p>(?:(?!<\/?p[^<]*>).)*the(?:(?!<\/?p[^<]*>).)*?quick(?:(?!<\/?p[^<]*>).)*?brown(?:(?!<\/?p[^<]*>).)*?<\/p>
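To see why the lookahead matters, here is a minimal Python sketch that compares the pattern above with a plain .* version. The sample strings and the names TEMPERED, NAIVE, split_words and one_paragraph are made up for illustration; Python's re module is only one of several engines that accept this syntax.

```python
import re

# The pattern from this answer: every gap may contain anything
# except an opening or closing <p> tag.
TEMPERED = re.compile(
    r"(?i)<p>(?:(?!<\/?p[^<]*>).)*the(?:(?!<\/?p[^<]*>).)*?quick"
    r"(?:(?!<\/?p[^<]*>).)*?brown(?:(?!<\/?p[^<]*>).)*?<\/p>"
)
# The same pattern with unrestricted gaps, for comparison.
NAIVE = re.compile(r"(?i)<p>.*the.*?quick.*?brown.*?<\/p>")

# Hypothetical sample: the three words are spread over two paragraphs.
split_words = "<p>The quick</p><p>brown fox</p>"
# Hypothetical sample shaped like the question: all words in one paragraph.
one_paragraph = (
    '<p><span style="font-size:16px"><strong>The quick brown </strong>'
    "</span>fox jumps.</p>"
)

print(bool(TEMPERED.search(split_words)))  # the gaps cannot cross </p><p>
print(bool(NAIVE.search(split_words)))     # .* happily spans both paragraphs
print(TEMPERED.sub("", "keep" + one_paragraph + "keep"))
```

The restricted version only removes a paragraph when all three words sit between the same <p> and </p>, at the cost of running a lookahead at every character.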
Upvotes: 1
Reputation: 13640
You can use the following in any editor (say Notepad++), in JavaScript, or in any PCRE engine with the g, m, i modifiers to match:
^<p>.*?the.*?quick.*?brown.*?<\/p>$
I used .* instead of .+ because of your statement that they MIGHT be separated by something else.
Replace each match with '' (empty string).
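If a script is preferred over an editor, the same replacement could look roughly like this in Python. This is only a sketch: the dump excerpt is invented, it assumes each candidate paragraph sits on its own line starting with <p>, and the trailing \n? is an addition (not part of the pattern above) so that the removed line does not leave an empty line behind.

```python
import re

# re.MULTILINE plays the role of the m modifier and re.IGNORECASE of i;
# re.sub replaces every match, which corresponds to g.
LINE = re.compile(
    r"^<p>.*?the.*?quick.*?brown.*?<\/p>$\n?",
    re.MULTILINE | re.IGNORECASE,
)

# Hypothetical three-line excerpt of an exported dump.
dump = (
    "<p>Keep this line.</p>\n"
    '<p><span style="font-size:16px"><strong>The quick brown </strong>'
    "</span>fox jumps.</p>\n"
    "<p>The slow brown fox.</p>\n"
)

cleaned = LINE.sub("", dump)
print(cleaned)
```

Because . does not match a newline by default, each match stays within one line, so lines missing any of the three words are left untouched.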
Upvotes: 0
Reputation: 18809
This expression ^<p>.*The.*quick.*brown.*</p>$ worked for me (the $ is written as \$ below because it sits inside a double-quoted shell string):
[root@fedora ~]# grep "^<p>.*The.*quick.*brown.*</p>\$" test1.txt
<p><span style="font-size:16px"><strong>The quick brown </strong></span><strong><span style="font-size:16px">fox jumps.</span></strong></p>
<p><strong>The quick brown </strong></span><strong><span style="font-size:16px">fox jumps.</span></strong></p>
<p>The quick brown </strong></span><strong><span style="font-size:16px">fox jumps.</p>
[root@fedora ~]#
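The grep call above only lists the matching lines. To actually drop them from an exported file, one option is a small line filter such as the Python sketch below. The function name drop_matching_lines and the idea of writing to a second file are assumptions for illustration, not something tested against a real dump.

```python
import re
from pathlib import Path

# Same expression as in the grep call, without the shell escape on $.
# Like grep without -i, it is case-sensitive.
MATCH = re.compile(r"^<p>.*The.*quick.*brown.*</p>$")


def drop_matching_lines(src: Path, dst: Path) -> int:
    """Copy src to dst without the matching lines; return how many were dropped."""
    dropped = 0
    with src.open(encoding="utf-8") as fin, dst.open("w", encoding="utf-8") as fout:
        for line in fin:
            # Strip the newline so $ is tested against the real end of the line.
            if MATCH.search(line.rstrip("\n")):
                dropped += 1
            else:
                fout.write(line)
    return dropped
```

For example, drop_matching_lines(Path("test1.txt"), Path("cleaned.txt")) would write every non-matching line of test1.txt to a new file, where cleaned.txt is just a placeholder name.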
Upvotes: 0