Reputation: 3261
I'm using Objective-C to write a program that pulls data out of an HTML file using regexes. The only lines that matter to the program contain the text popupName, and I need to strip all HTML tags from them as well. Can this be done with one regex?
So far I have been using popupName to find the line I am looking for and then deleting everything matching <[^>]*>.
Could these two operations be combined into one?
Here's example input:
<div>
<div class="popupName"> Bob Smith</div>
<div class="popupTitle">
<i></i>
</div>
<br />
<div class="popupTitle"></div>
<div class="popupLink"><a href="mailto:"></a></div>
</div>
From that I would like to extract only "Bob Smith". The real file, however, contains multiple popupName lines like that one.
Upvotes: 0
Views: 213
Reputation: 47284
Your pattern is pretty close to what you likely want. It only needs one addition:
"popupName">(.*)|<[^>]*>
Adding "popupName" followed by a capture group will allow you to grab the specific info you want.
In Objective-C:
// Sample input from the question, flattened onto a single line
NSString *searchText = @"<div><div class=\"popupName\"> Bob Smith</div><div class=\"popupTitle\"><i></i></div><br /><div class=\"popupTitle\"></div><div class=\"popupLink\"><a href=\"mailto:\"></a></div></div>";
NSString *pattern = @"\"popupName\">(.*)|<[^>]*>";
NSRange searchRange = NSMakeRange(0, [searchText length]);
NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
// Replace every match with capture group 1, which is empty when a plain tag matched
NSString *results = [regex stringByReplacingMatchesInString:searchText options:0 range:searchRange withTemplate:@"$1"];
NSLog(@"results: %@", results);
Result:
results: Bob Smith
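Since the question mentions multiple popupName lines, here is a rough sketch of a different technique: collecting each match's capture group instead of doing a replacement. A replacement keeps any text that sits outside the tags, so it only isolates the name when the input has no other text, as in the sample. The helper name PopupNames and the narrower [^<]* capture are my own assumptions, not part of the code above, and the sketch assumes the name itself contains no nested tags.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (the name is an assumption): returns the trimmed text that
// follows every "popupName"> in the given HTML string.
static NSArray<NSString *> *PopupNames(NSString *html) {
    // [^<]* stops at the next tag, so each match covers a single name.
    NSString *pattern = @"\"popupName\">([^<]*)";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    NSMutableArray<NSString *> *names = [NSMutableArray array];
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSRange fullRange = NSMakeRange(0, html.length);
    for (NSTextCheckingResult *match in [regex matchesInString:html options:0 range:fullRange]) {
        // Capture group 1 holds the text between the opening tag and the next '<'.
        NSString *raw = [html substringWithRange:[match rangeAtIndex:1]];
        [names addObject:[raw stringByTrimmingCharactersInSet:whitespace]];
    }
    return names;
}
```

Called on the example input from the question, this would return a one-element array containing Bob Smith. With several popupName lines you get one entry per line, in document order.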
Upvotes: 2
Reputation: 539
I've been playing with this for a bit, but I'm using JavaScript and can't do a positive lookbehind. If Objective-C lets you use a positive lookbehind and a positive lookahead, you should be able to do this.
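For what it's worth, NSRegularExpression is built on ICU regular expressions, which do accept fixed-length lookbehind and lookahead, so the idea can be sketched roughly like this. The helper name FirstPopupName and the exact pattern are assumptions for illustration, and the sketch assumes the name is followed directly by </div> with no nested tags.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (the name is an assumption): returns the first popupName
// value with surrounding whitespace trimmed, or nil when nothing matches.
static NSString *FirstPopupName(NSString *html) {
    // The lookbehind and lookahead assert the surrounding markup without
    // consuming it, so the whole match is just the text between the tags.
    NSString *pattern = @"(?<=\"popupName\">)[^<]*(?=</div>)";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return nil;
    }

    NSTextCheckingResult *match =
        [regex firstMatchInString:html options:0 range:NSMakeRange(0, html.length)];
    if (match == nil) {
        return nil;
    }
    NSString *raw = [html substringWithRange:match.range];
    return [raw stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];
}
```

The leading space in " Bob Smith" is trimmed afterwards rather than in the pattern, because the lookbehind has to stay bounded in length.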
Upvotes: 0