Marc

Reputation: 2859

Regular expression - Text between colons

I have markup like this:

<h1>
5/2009
<br/>
Question: This is the question
</h1>

I would like to get the first part after the <br/>, that is, the string before the colon (:).

--> The result should be "Question"

Note: the words change. Sometimes it is "Question", other times it may be "Big question", and so on.

I tried <h1>(.{0,50}):(.{0,50}), but this returns too much (the date as well).

I'm not experienced with regex. Can anyone help me with this?

Thanks a lot.

Upvotes: 0

Views: 3937

Answers (4)

Marc

Reputation: 2859

My brain is flooding. Many thanks to everyone who has already helped.

Could anyone help once more? This is really important to me.

<ul>
<li>
07.05.2009:
<a href="#1">Test 1</a>
</li>
<li>
05.01.2009:
<a href="#2">Test 2</a>
</li>
</ul>

This time I would like to read the second part. Ideally, I would get both parts separately from one regex.

So: 1. 07.05.2009 2. Test 1

Upvotes: 0

Dave Sherohman

Reputation: 46187

Think about what you mean and translate that into the regex language. As Gumbo has pointed out, you should be using [^:] instead of .; the reason for this is that you are looking for groups of characters that aren't colons ([^:]), not for groups of absolutely any character at all[1] (.) which happen to have colons between them.

Any time you find yourself using . with a quantifier in a regex, stop and ask yourself whether you really mean "any character" or whether you could express your meaning more clearly (and get more accurate results) using a character class instead.

(Non-greedy quantifiers (.*?) can also do the job of getting correct matches in cases like this, but character classes are still a clearer expression of intent for human readers and improve efficiency by avoiding excessive backtracking for machine readers.)

[1] Well, absolutely any character at all, with the possible exception of newlines depending on the regex implementation that you're using.
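
As a rough illustration of the difference (a sketch only, not part of the original answer), the Python snippet below runs the asker's dot-based pattern and a character-class pattern against the sample markup from the question. Python's re module and the re.DOTALL flag are assumptions made here; the thread does not say which regex engine is in use.

```python
import re

# Sample markup copied from the question.
html = "<h1>\n5/2009\n<br/>\nQuestion: This is the question\n</h1>"

# Dot version: with re.DOTALL, "." also matches newlines, so the first
# group runs from <h1> up to the colon and picks up the date as well.
dot_match = re.search(r"<h1>(.{0,50}):(.{0,50})", html, re.DOTALL)
print(repr(dot_match.group(1)))

# Character-class version: [^:]+ can only consume non-colon characters,
# and anchoring on <br/> keeps the date out of the capture.
class_match = re.search(r"<br/>\s*([^:]+):", html)
print(repr(class_match.group(1)))

# Non-greedy version: also stops at the first colon, but it gets there by
# extending the match one character at a time.
lazy_match = re.search(r"<br/>\s*(.+?):", html)
print(repr(lazy_match.group(1)))
```

Both the character-class and the non-greedy pattern capture only the label here; the dot-based capture includes the date and the <br/> tag.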

Upvotes: 1

Aistina

Reputation: 12683

I believe this will work:

<h1>.*?<br\s*/>([^:]+):(.*?)</h1>
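
A minimal sketch of how a two-group pattern of this shape might be applied in Python, assuming the re module. The variant below accepts <br/> with or without a space, uses re.DOTALL so that the dot can cross line breaks, and trims surrounding whitespace; the names HEADING and split_heading are hypothetical and introduced only for this example.

```python
import re

# Accepts <br/> and <br />; re.DOTALL lets ".*?" cross line breaks.
HEADING = re.compile(r"<h1>.*?<br\s*/>\s*([^:]+):\s*(.*?)\s*</h1>", re.DOTALL)

def split_heading(html):
    """Return (label, text) from the first matching <h1>, or None."""
    match = HEADING.search(html)
    return match.groups() if match else None

# Sample markup copied from the question.
sample = "<h1>\n5/2009\n<br/>\nQuestion: This is the question\n</h1>"
result = split_heading(sample)
print(result)
```

The first group holds the part before the colon and the second group the part after it, so both pieces come out of one search.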

Upvotes: 1

Gumbo

Reputation: 655129

Try this:

<br/>([^:]+):
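
For illustration, the pattern could be used from Python roughly as follows. This is a sketch under the assumption that Python's re module is the engine; label_after_br is a hypothetical helper name, and the second input is made-up markup in the same shape as the question's sample.

```python
import re

def label_after_br(html):
    """Return the text between <br/> and the next colon, or None."""
    match = re.search(r"<br/>([^:]+):", html)
    # [^:] also matches the newline after <br/>, so trim whitespace.
    return match.group(1).strip() if match else None

# First input is the question's sample; the second is invented for this example.
first = label_after_br("<h1>\n5/2009\n<br/>\nQuestion: This is the question\n</h1>")
second = label_after_br("<h1>\n6/2009\n<br/>\nBig question: Another one\n</h1>")
print(first, second)
```

Because the class excludes only the colon, the label may contain spaces or change in length without any change to the pattern.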

Upvotes: 2
