Liutas

Reputation: 5783

How to match everything up to some text, but not across one word

In a text I want to find structures like "everything up to some text", but the match must not contain a certain word in between.

Example in text:

Templates : You can add custom templates for your theme. Updated on 2010 look[124] end
Media RSS feed : Add the Cooliris Effect to your gallery Updated on 2011 look[124]
Role settings : Each gallery has a author Updated at 2010 ...  look[124] end
AJAX based thumbnail generator : No more server Updated on 2010 look[124] end limitation during the batch process Copy/Move : Copy or move images between Updated on 2010 this look[124] galleries Sortable Albums : Create your own sets of images Updated on 2010 this look[124] end
Upload or pictures via a zip-file (Not in Safe-mode)
Watermark function : You can add a watermark image or text 
...

I need to find "Updated .*[124] end": every match must start with "Updated" and end with "[number]" followed by the word "end". Some text looks very similar but does not end with the word "end"; that text must not match. How can I make this work?

I tried

/Updated(.*?)\[.*?\]\send/msi

or

Updated(.*?)\[.*?\](?!Updated)\send

But these also match strings like:

Updated on 2011 look[124] Role settings : Each gallery has a author Updated at 2010 ...  look[124] end
Updated on 2010 this look[124] galleries Sortable Albums : Create your own sets of images Updated on 2010 this look[124] end

How do I write a regex which skips the bad matches?

http://regexr.com?2vh1j

Thanks for your opinion.

Upvotes: 3

Views: 119

Answers (6)

Dan

Reputation: 2341

Maybe you can try a different approach:

/Updated[\w.\s]*\[\d+\]\send/

Explanation:

Updated

This will match the word Updated

[\w.\s]*

then any run of letters, digits, underscores, whitespace and dots (you can add any other characters you need)

\[\d+\]

then a number between brackets

\send

then a whitespace character and finally the word end
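
A minimal sketch of this pattern in Python (the choice of Python's re module is an assumption, the question does not name an engine). The first two strings are taken from the question's sample; the third, with a colon, is a made-up example showing that characters outside the class stop the match unless you add them.

```python
import re

# The character-class pattern from this answer.
pattern = re.compile(r"Updated[\w.\s]*\[\d+\]\send")

# Two lines from the question's sample data.
ok = "Role settings : Each gallery has a author Updated at 2010 ...  look[124] end"
no_end = "Media RSS feed : Add the Cooliris Effect to your gallery Updated on 2011 look[124]"

# Hypothetical line: ':' is not in the class, so the run breaks before '['.
colon = "Updated on 2010: look[124] end"

hit = pattern.search(ok)
print(hit.group(0) if hit else None)
print(pattern.search(no_end))
print(pattern.search(colon))
```

Because the class does not contain "[", the run cannot cross an earlier look[124], which is what keeps the match from spilling into the next entry.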

Upvotes: 0

Alan Moore

Reputation: 75272

I think this is what you were trying for with your second regex:

Updated\s++(?>(?!Updated\b|end\b)\S+\s+)*+end\b

In other words, match Updated and look for the corresponding end. If you find another Updated first, you know you started at the wrong place, so abandon that match. I excluded end as well because that lets me match the words possessively (i.e., with *+); the regex never has to backtrack to find or (more importantly) eliminate a match.

If you really have to specify the look[nnn] part, this should do the trick:

Updated\s++(?>(?!Updated\b|end\b|look\[\d+\])\S+\s+)*+look\[\d+\]\s+end\b

Add the i flag for a case-insensitive match if you need to, but you don't need the m or s flags. If this seems overly complicated, it's because I don't know your data as well as you do. There's a good chance this is all you really need:

Updated(?:(?!Updated).)*\send
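
To see the difference in practice, here is a small Python sketch (Python is an assumption; the question does not say which engine is in use) that runs the lazy pattern from the question next to the last pattern above. The sample string is a shortened, made-up version of the question's data. The possessive and atomic syntax in the longer patterns depends on engine support (Python's re only accepts it from version 3.11, as far as I know), so the sketch sticks to the simplest pattern.

```python
import re

# Shortened, made-up sample in the shape of the question's data:
# the middle entry has no trailing "end".
text = (
    "Templates Updated on 2010 look[124] end "
    "Media RSS Updated on 2011 look[124] "
    "Role settings Updated at 2010 look[124] end"
)

# The lazy attempt from the question: it can run past a second "Updated".
lazy = re.compile(r"Updated(.*?)\[.*?\]\send")

# The last pattern above: no "Updated" is allowed inside the match.
tempered = re.compile(r"Updated(?:(?!Updated).)*\send")

lazy_matches = [m.group(0) for m in lazy.finditer(text)]
tempered_matches = [m.group(0) for m in tempered.finditer(text)]

for m in lazy_matches:
    print("lazy:    ", m)
for m in tempered_matches:
    print("tempered:", m)
```

The lazy pattern's second match starts at the entry without end and swallows the following entry; the lookahead version rejects that starting point and picks up the third entry on its own.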

Upvotes: 1

Birei

Reputation: 36292

One possibility:

Updated([^[]*)\[124\]\s+end

Explanation:

Updated          # Word 'Updated'
[^[]*            # All chars until '['
\[124\]          # String '[124]'
\s+              # One or more spaces.
end              # String 'end'

Upvotes: 0

Qtax

Reputation: 33928

To match a string that does not contain Updated you can use constructs like:

(?:[^U]+|U(?!pdated))*

and

(?:(?!Updated).)*

Using the first alternative would give you an expression like:

Updated((?:[^U]+|U(?!pdated))*)\[\d+\]\send

First alternative explained:

(?:          # non-capturing group
[^U]+        # any characters that aren't "U"
|U(?!pdated) # or a "U" which is not followed by "pdated" (ie. not "Updated")
)*           # repeated as much as possible

Second alternative:

(?:          # non-capturing group
(?!Updated). # Use a lookahead check at every character to make sure it's not "Updated"
)*           # repeated as much as possible
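
A rough Python sketch of both ideas (Python and the sample string are assumptions, not part of the question). The second pattern is my own rearrangement of the first alternative into an "unrolled" form, `[^U]*(?:U(?!pdated)[^U]*)*`, rather than the exact expression above: it accepts the same text, but each character can be consumed in only one way, which avoids the heavy backtracking that a `+` nested inside a `*` can cause in backtracking engines when the rest of the pattern fails.

```python
import re

# Shortened, made-up sample: the middle entry has no trailing "end".
text = (
    "Templates Updated on 2010 look[124] end "
    "Media RSS Updated on 2011 look[124] "
    "Role settings Updated at 2010 look[124] end"
)

# Second alternative: lookahead check before every character.
tempered = re.compile(r"Updated((?:(?!Updated).)*)\[\d+\]\send")

# Unrolled variant of the first alternative (an assumption of this sketch).
unrolled = re.compile(r"Updated([^U]*(?:U(?!pdated)[^U]*)*)\[\d+\]\send")

tempered_hits = [m.group(0) for m in tempered.finditer(text)]
unrolled_hits = [m.group(0) for m in unrolled.finditer(text)]

# Group 1 holds the text between "Updated" and the bracketed number.
captured = [m.group(1) for m in tempered.finditer(text)]

print(tempered_hits)
print(unrolled_hits)
print(captured)
```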

Upvotes: 1

Xophmeister

Reputation: 9219

Use a lazy regex:

Updated.*?\[.*?\]( end)?

Upvotes: 0

Karl Bielefeldt

Reputation: 49148

Assuming all the invalid matches have a [124], but not an end, you can filter those out by not allowing a [ between Updated and the end sequence, like this:

Updated([^[]*?)\[\d*\]\send
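
A quick way to check this is a short Python sketch (Python and the shortened sample string are assumptions). The class is written as `[^\[]` here, which is equivalent to `[^[]` and just sidesteps any ambiguity about an unescaped bracket inside a set.

```python
import re

# Same pattern as above, with the bracket inside the class escaped.
pattern = re.compile(r"Updated([^\[]*?)\[\d*\]\send")

# Shortened, made-up sample: the middle entry has no trailing "end".
text = (
    "Templates Updated on 2010 look[124] end "
    "Media RSS Updated on 2011 look[124] "
    "Role settings Updated at 2010 look[124] end"
)

matches = [m.group(0) for m in pattern.finditer(text)]
for m in matches:
    print(m)
```

The entry without end fails because the only way to reach a later " end" would be to cross its own "[", which the class forbids. The same restriction means a valid entry containing an extra "[" before its number would also be skipped.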

Upvotes: 1
