Reputation: 1
I have been working on this all day and can't find a solution. Here is my current code:
stranger = re.search(r"Stranger:</strong> <span>.+?</span></p></div></div></div>", html2)
I want a result like this:
"Stranger:</strong> <span>What now?</span></p></div></div></div>" = True
from a string like this:
"<div class=\"logitem\"><p class=\"strangermsg\"><strong class=\"msgsource\">Stranger:</strong> <span>Wow</span></p></div><div class=\"logitem\"><p class=\"youmsg\"><strong class=\"msgsource\">You:</strong> <span>Eek</span></p></div><div class=\"logitem\"><p class=\"strangermsg\"><strong class=\"msgsource\">Stranger:</strong> <span>What now?</span></p></div></div></div>"
Instead I get this:
"Stranger:</strong> <span>Wow</span></p></div><div class=\"logitem\"><p class=\"youmsg\"><strong class=\"msgsource\">You:</strong> <span>Eek</span></p></div><div class=\"logitem\"><p class=\"strangermsg\"><strong class=\"msgsource\">Stranger:</strong> <span>What now?</span></p></div></div></div>" = True
Basically, I want to match everything before the final "/span p div div div" and after the nearest preceding "span" (no /), that is, only the last Stranger message. I've tried all kinds of things, but nothing has worked. Can anyone help?
Upvotes: 0
Views: 50
Reputation: 483
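Try disallowing tag delimiters between the two inner tags. The lazy .+? does not help here: re.search still starts at the leftmost position where a match is possible, which is the first "Stranger:", and then expands until the closing tags finally match. For example,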
stranger = re.search(r"Stranger:</strong> <span>[^<>]+?</span></p></div></div></div>", html2)
This means that whatever is between those two inner tags cannot contain other < or > characters, so the match cannot run across the tags of an earlier message.
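For what it's worth, here is a minimal sketch comparing the two patterns on the sample string from the question. The capture group around the character class is my own addition (not in the pattern above) so the message text can be pulled out directly, and the sample is written with plain quotes inside single-quoted Python literals.

```python
import re

# Sample string from the question, split across lines for readability.
html2 = (
    '<div class="logitem"><p class="strangermsg"><strong class="msgsource">'
    'Stranger:</strong> <span>Wow</span></p></div>'
    '<div class="logitem"><p class="youmsg"><strong class="msgsource">'
    'You:</strong> <span>Eek</span></p></div>'
    '<div class="logitem"><p class="strangermsg"><strong class="msgsource">'
    'Stranger:</strong> <span>What now?</span></p></div></div></div>'
)

# Original pattern: lazy, but the match still begins at the first "Stranger:".
lazy = re.search(
    r"Stranger:</strong> <span>.+?</span></p></div></div></div>", html2
)

# Restricted pattern: the span content may not contain < or >.
# The parentheses are an added capture group for the message text.
strict = re.search(
    r"Stranger:</strong> <span>([^<>]+?)</span></p></div></div></div>", html2
)

print(lazy.group(0)[:30])  # begins at the first Stranger message
print(strict.group(0))     # only the last Stranger message and closing tags
print(strict.group(1))     # just the message text
```

With the restricted pattern, the attempt at the first "Stranger:" fails as soon as the engine reaches the "<" after "Wow" without finding the three closing divs, so the search moves on to the last message.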
Upvotes: 1