DannyhelMont

Reputation: 390

How to match all strings found unless they contain a specific word?

I'm trying to get this regular expression to find matches in a text while excluding any match that has the word "avatar" in the middle of it:

http.{5,10}(media.tumblr).+?(?!avatar).+?(png|jpg|jpeg|gif|swf)/g

On this generic sample it should match only lines 1º and 3º, but it matches all four (the quotes mark the exact text that gets matched):

1º"httpghghghmedia.tumblrfgdfdfgif"rdfgifgjdthythpng
2º"httpahttvhmedia.tumblrffdfavatarfgdfdfgif"rdfgifadadadadad
3ºdg"httpghghghmedia.tumblrfgdfdfgif"addadaa
4ºuilfgfgfpo"httpdsfsdfmedia.tumblrDDavatarsdfsdpng"

I tried other methods, some of them too complicated to get working, but most answers to this same question and the references I found do not cover the case where ".+?" is needed twice, or where several search patterns such as "media.tumblr" and "gif" sit in different places separated by ".+?". From my tests, the ".+?" before the negative lookahead causes the lookahead to be ignored. Can anyone tell me if there is a way to fix this? regex101 and the tutorials didn't help me :/

Upvotes: 2

Views: 584

Answers (1)

Burdui

Reputation: 1302

TL;DR Full Regex

http.{5,10}(?:media.tumblr)(?:(?!avatar).)+?(?:png|jpg|jpeg|gif|swf)

Why it fails

.+?(?!avatar).+?<anything else>

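The first .+? is lazy, so it starts by matching a single character. The lookahead (?!avatar) is then checked only at that one position. If the string avatar does come next, the engine backtracks and lets the first .+? also consume the a of avatar, after which the lookahead sees vatar and succeeds. The second .+? then matches everything else until <anything else> can be matched, including any avatar further along.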
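
As a rough check of the explanation above, here is a minimal Python sketch (Python's re module is an assumption here; the question does not name an engine, and the trailing /g flag is dropped). It runs the asker's original pattern over the four samples, then uses a small made-up string, xYavatar_gif, with capture groups to show what each .+? ends up consuming.

```python
import re

# The asker's original pattern, without the trailing /g flag.
ORIGINAL = re.compile(
    r"http.{5,10}(media.tumblr).+?(?!avatar).+?(png|jpg|jpeg|gif|swf)"
)

samples = [
    '1º"httpghghghmedia.tumblrfgdfdfgif"rdfgifgjdthythpng',
    '2º"httpahttvhmedia.tumblrffdfavatarfgdfdfgif"rdfgifadadadadad',
    '3ºdg"httpghghghmedia.tumblrfgdfdfgif"addadaa',
    '4ºuilfgfgfpo"httpdsfsdfmedia.tumblrDDavatarsdfsdpng"',
]

# The lookahead is tested at one position only, so every sample matches.
original_matches = [ORIGINAL.search(s) is not None for s in samples]
print(original_matches)  # [True, True, True, True]

# Hypothetical reduced case: capture both lazy parts to see the backtracking.
demo = re.search(r"x(.+?)(?!avatar)(.+?)gif", "xYavatar_gif")
print(demo.group(1))  # Ya      (the first .+? swallowed the "a" of avatar)
print(demo.group(2))  # vatar_  (the second .+? took the rest)
```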

A solution

Replace the part with

(?:(?!avatar).)+?<anything else>

Why it works

(?!avatar). matches a single character that is not the start of the string avatar. The part (?:(?!avatar).)+? (lazily) matches a run of characters that all fulfill this property. If none of the matched characters is the starting character of an avatar, then the run cannot contain the string avatar.
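
To see the full regex from the TL;DR applied to the question's four samples, here is a short Python sketch (again assuming Python's re module, which the question does not specify). It collects the matched text per sample, or None when nothing matches.

```python
import re

# The full regex from the TL;DR, copied unchanged.
FIXED = re.compile(
    r"http.{5,10}(?:media.tumblr)(?:(?!avatar).)+?(?:png|jpg|jpeg|gif|swf)"
)

samples = [
    '1º"httpghghghmedia.tumblrfgdfdfgif"rdfgifgjdthythpng',
    '2º"httpahttvhmedia.tumblrffdfavatarfgdfdfgif"rdfgifadadadadad',
    '3ºdg"httpghghghmedia.tumblrfgdfdfgif"addadaa',
    '4ºuilfgfgfpo"httpdsfsdfmedia.tumblrDDavatarsdfsdpng"',
]

matches = []
for sample in samples:
    found = FIXED.search(sample)
    # Keep the matched text, or None when the sample is rejected.
    matches.append(found.group(0) if found else None)

for sample, matched in zip(samples, matches):
    print(sample[:2], matched)
# 1º httpghghghmedia.tumblrfgdfdfgif
# 2º None
# 3º httpghghghmedia.tumblrfgdfdfgif
# 4º None
```

The dot in media.tumblr is left unescaped, as in the original pattern, so it matches any character; writing media\.tumblr would restrict it to a literal period.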

Upvotes: 2
