Reputation: 554
Suppose I have the following text:
dogs are very nice <a href="http://dogs.com">read about nice dogs here</a>
I need to replace "dogs" with "cats" everywhere except inside angle brackets, so that the text becomes:
cats are very nice <a href="http://dogs.com">read about nice cats here</a>
I found the regex \([^)]*\) and thought it might come in handy here, but it does not seem to work:
import re
s = 'dogs are very nice <a href="http://dogs.com">read about nice dogs here</a>'
s = re.sub(r'\([^)]*\)', 'cats', s)
print(s)
'dogs are very nice <a href="http://dogs.com">read about nice dogs here</a>'
I'm sorry if this question looks lame, but I'm really new to regex. Thanks for your help.
Upvotes: 0
Views: 30
Reputation: 4689
This regex has nothing to do with what you want: it mentions neither "dog" nor angle brackets. What it actually does is match any text enclosed in round parentheses, e.g. (abc).
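To see why nothing changed, here is a small sketch of what that pattern does. The sample string with_parens is made up for illustration; the second string is the one from the question.

```python
import re

# A literal "(", then any run of characters that are not ")", then a literal ")".
pattern = r'\([^)]*\)'

# Hypothetical sample that does contain round parentheses.
with_parens = 'dogs (and puppies) are nice'
# The string from the question: no round parentheses anywhere.
without_parens = 'dogs are very nice <a href="http://dogs.com">read about nice dogs here</a>'

replaced = re.sub(pattern, 'cats', with_parens)
unchanged = re.sub(pattern, 'cats', without_parens)

print(replaced)                      # the parenthesised part is swapped out
print(unchanged == without_parens)   # nothing matched, so nothing was replaced
```

Since the question's string has no "(" in it, re.sub finds no match and returns the input as it was.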
More generally, I don't think you'll be able to use regular expressions here.
If the HTML doesn't contain any other angle brackets (quite an assumption), you might be successful with (<[^<>]*>[^<>]*)*dogs, which should match "dogs" only if each "<" preceding it is eventually followed by a ">".
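If you do want to try a regex under that same assumption, one possible sketch is below. Note that it uses a different pattern from the one above: a negative lookahead that rejects "dogs" whenever the next angle bracket after it is a ">", which would mean we are inside a tag. The name outside_tags is just for illustration, and the pattern will misfire as soon as a stray "<" or ">" shows up in text or attribute values.

```python
import re

s = 'dogs are very nice <a href="http://dogs.com">read about nice dogs here</a>'

# "dogs" not followed by (non-bracket characters and then ">").
# If the next bracket is ">", the word sits inside a tag and is skipped.
outside_tags = re.compile(r'dogs(?![^<>]*>)')

result = outside_tags.sub('cats', s)
print(result)
```

Here the "dogs" inside the href is left alone, while the two occurrences in the visible text are replaced.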
But seriously, just install something like Beautiful Soup and parse the HTML; it's easy and a lot more robust.
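To give an idea of the parsing approach, here is a rough sketch. It uses html.parser from the standard library rather than Beautiful Soup, only so that it runs without installing anything; the class TextReplacer and the function replace_in_text are names made up for this example. It rebuilds the markup piece by piece and only touches the text between tags. Comments, doctypes and similar constructs are not handled here, so treat it as an illustration rather than a finished tool.

```python
from html.parser import HTMLParser


class TextReplacer(HTMLParser):
    """Re-emit the HTML, replacing `old` with `new` in text nodes only."""

    def __init__(self, old, new):
        # Keep entities such as &amp; as they are instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.old = old
        self.new = new
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Copy the tag exactly as written, attributes included.
        self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self.parts.append(f'</{tag}>')

    def handle_data(self, data):
        # Only text outside tags ends up here, so replacing is safe.
        self.parts.append(data.replace(self.old, self.new))

    def handle_entityref(self, name):
        self.parts.append(f'&{name};')

    def handle_charref(self, name):
        self.parts.append(f'&#{name};')


def replace_in_text(html, old, new):
    parser = TextReplacer(old, new)
    parser.feed(html)
    parser.close()  # flush any text still buffered
    return ''.join(parser.parts)


s = 'dogs are very nice <a href="http://dogs.com">read about nice dogs here</a>'
print(replace_in_text(s, 'dogs', 'cats'))
```

The attribute value http://dogs.com never passes through handle_data, so it stays untouched no matter what brackets appear elsewhere in the document.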
Upvotes: 1