JavaSa

Reputation: 6241

Using Regex to catch text until first occurrence of certain character

Suppose I have the following text: Temp:C5E501374D0343090957F7E5929E765C931F7D3EC7A96189FDA88549D54D9E4E5DB3FC1C2, adfsafd1242412,

I want to capture everything after Temp: up to the first occurrence of , which means: C5E501374D0343090957F7E5929E765C931F7D3EC7A96189FDA88549D54D9E4E5DB3FC1C2

I tried the regex Temp:(.+,) without success.
How do I tell the regex that the match should stop at the first , it finds?

Upvotes: 9

Views: 20486

Answers (4)

Wiktor Stribiżew

Reputation: 626689

To capture the value you need, you can use a lazy dot pattern (.+? matches one or more characters other than a newline, as few as possible):

Temp:(.+?),

Since a lazy pattern can still match more than you need (it keeps expanding until the rest of the pattern matches), a negated character class ([^,]+ matches one or more characters other than a comma) is preferable:

Temp:([^,]+)

The result is captured into Group 1 with the capturing group (parentheses).

IDEONE sample code:

import re
p = re.compile(r'Temp:([^,]+)')
test_str = "Temp:C5E501374D0343090957F7E5929E765C931F7D3EC7A96189FDA88549D54D9E4E5DB3FC1C2, adfsafd1242412,"
print(p.search(test_str).group(1))

Output: C5E501374D0343090957F7E5929E765C931F7D3EC7A96189FDA88549D54D9E4E5DB3FC1C2
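
To make the difference between the three patterns concrete, here is a minimal sketch comparing the greedy pattern from the question, the lazy pattern, and the negated character class. The input is a shortened stand-in for the question's string, and the "Temp:,next," edge case is an assumption added for illustration:

```python
import re

# Shortened stand-in for the question's input (assumption, for readability)
text = "Temp:C5E501374D, adfsafd1242412,"

greedy = re.search(r"Temp:(.+,)", text).group(1)
lazy = re.search(r"Temp:(.+?),", text).group(1)
negated = re.search(r"Temp:([^,]+)", text).group(1)

print(greedy)   # C5E501374D, adfsafd1242412,
print(lazy)     # C5E501374D
print(negated)  # C5E501374D

# Hypothetical edge case: an empty value right after "Temp:"
empty = "Temp:,next,"
lazy_empty = re.search(r"Temp:(.+?),", empty)
negated_empty = re.search(r"Temp:([^,]+)", empty)

print(lazy_empty.group(1))  # ,next  (the lazy dot had to consume the first comma)
print(negated_empty)        # None   (no non-comma character follows "Temp:")
```

On well-formed input the lazy and negated patterns agree; they only differ when the value is empty, where the lazy dot runs past the first comma and the negated class simply does not match.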

NOTE that a look-around based solution is more resource-consuming than the capturing group one that you and I are using.
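
If you want to check the cost on your own machine, a rough timing sketch with timeit could look like this. The shortened input, the function names, and the iteration count are assumptions for illustration, and the numbers will vary by Python version and hardware, so treat the result as indicative only:

```python
import re
import timeit

# Shortened stand-in for the question's input (assumption)
text = "Temp:C5E501374D, adfsafd1242412,"

group_re = re.compile(r"Temp:([^,]+)")
lookbehind_re = re.compile(r"(?<=Temp:)[^,]+")

def with_group():
    # Capturing-group variant: the value is in group 1
    return group_re.search(text).group(1)

def with_lookbehind():
    # Lookbehind variant: the value is the whole match
    return lookbehind_re.search(text).group()

group_time = timeit.timeit(with_group, number=100_000)
lookbehind_time = timeit.timeit(with_lookbehind, number=100_000)

print(f"capturing group: {group_time:.3f}s")
print(f"lookbehind:      {lookbehind_time:.3f}s")
```

Both functions return the same string, so the comparison only measures how the match is found.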

Upvotes: 20

Tomasz Nguyen

Reputation: 2611

Try the following regexp: Temp:([^,]+,)

Now everything after Temp: up to and including the first , is captured.
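
A quick sketch of what this pattern captures, next to a variant with the comma moved outside the group. The input is a shortened stand-in for the question's string (an assumption for readability):

```python
import re

# Shortened stand-in for the question's input (assumption)
text = "Temp:C5E501374D, adfsafd1242412,"

# Comma inside the group: the captured value ends with ","
with_comma = re.search(r"Temp:([^,]+,)", text).group(1)

# Comma outside the group: only the value itself is captured
without_comma = re.search(r"Temp:([^,]+),", text).group(1)

print(with_comma)     # C5E501374D,
print(without_comma)  # C5E501374D
```

If the trailing comma is not wanted, keeping it outside the parentheses avoids having to strip it afterwards.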

Upvotes: 0

wojtossfm

Reputation: 61

You need to add the ? character to make the + non-greedy (lazy). Otherwise the .+ part of the regex also matches the , characters and runs to the last comma. In my answer I moved the , outside of the group, since from your description I understood that you don't want it in the match.

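import re
# Sample input from the question
a = "Temp:C5E501374D0343090957F7E5929E765C931F7D3EC7A96189FDA88549D54D9E4E5DB3FC1C2, adfsafd1242412,"
matcher = re.compile(r"Temp:(.+?),")
print(matcher.match(a).group(1))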
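
One thing to keep in mind with the snippet above: match only succeeds when the pattern matches at the very start of the string. The sketch below uses a hypothetical input where Temp: appears later in the line, in which case search is needed:

```python
import re

matcher = re.compile(r"Temp:(.+?),")

# Hypothetical input where "Temp:" is not at the start of the string
line = "id=7, Temp:ABC123, rest,"

at_start = matcher.match(line)    # anchored at position 0
anywhere = matcher.search(line)   # scans the whole string

print(at_start)           # None
print(anywhere.group(1))  # ABC123
```

For the question's input, where the text begins with Temp:, both calls return the same result.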

Upvotes: 2

anubhava

Reputation: 784948

You can use this lookbehind-based regex:

(?<=Temp:)[^,]+


Code:

import re
s = 'Temp:C5E501374D0343090957F7E5929E765C931F7D3EC7A96189FDA88549D54D9E4E5DB3FC1C2, adfsafd1242412,'
print(re.search(r"(?<=Temp:)[^,]+", s).group())

Output:

C5E501374D0343090957F7E5929E765C931F7D3EC7A96189FDA88549D54D9E4E5DB3FC1C2

Upvotes: 5
