Reputation: 476
I am trying to keep only the first 5 characters/digits after the last hyphen. Here are examples of the expected result:
Output -- X008-TGa19-ER751
Output -- X002-KF13-ER782
My attempt -- I managed to match the element after the last hyphen with (\w+)[^-.]*$
But how do I take only its first 5 characters and then return the entire value as the output, as shown in the examples?
Upvotes: 0
Views: 1211
Reputation: 163577
You can match 1+ word chars from the start of the string and optionally repeat a - and 1+ word chars. Then match the last - and match 5 word chars.
^\w+(?:-\w+)*-\w{5}
^ Start of string
\w+ Match 1+ word chars
(?:-\w+)* Optionally repeat - and 1+ word chars
-\w{5} Match - and 5 word chars

import re
regex = r"^\w+(?:-\w+)*-\w{5}"
s = ("X008-TGa19-ER751QF7\n"
"X002-KF13-ER782cPU80")
print(re.findall(regex, s, re.MULTILINE))
Output
['X008-TGa19-ER751', 'X002-KF13-ER782']
Note that \w can also match _.
If there can also be other characters in the string, to get the first 5 digits or characters except _ after the last hyphen, you can match word characters without an underscore using a negated character class [^\W_]. Repeat that 5 times with {5} while asserting that there are no more hyphens to the right.
^.*-[^\W_]{5}(?=[^-]*$)
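As a rough sketch of how that second pattern behaves in Python: the helper name `trim_after_last_hyphen` and the sample strings containing `.` and `_` are made up for illustration and are not from the question.

```python
import re

# Pattern from the answer above: greedy prefix, a hyphen, 5 alphanumerics
# (word chars minus underscore), and a lookahead asserting that no further
# hyphen follows before the end of the string.
PATTERN = re.compile(r"^.*-[^\W_]{5}(?=[^-]*$)")


def trim_after_last_hyphen(value):
    """Return the match for one string, or None if the pattern does not fit.

    Hypothetical helper, named here only for this example.
    """
    m = PATTERN.match(value)
    return m.group(0) if m else None


# The "." before the last hyphen is fine because the prefix is matched by .*
print(trim_after_last_hyphen("X008-TG.a19-ER751QF7"))  # X008-TG.a19-ER751
print(trim_after_last_hyphen("X002-KF13-ER782cPU80"))  # X002-KF13-ER782
# An underscore among the first 5 chars after the last hyphen prevents a match.
print(trim_after_last_hyphen("X002-KF13-ER_82cPU80"))  # None
```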
Upvotes: 1
Reputation: 40066
^(.*-[^-]{5})[^-]*$
Capture group 1 is what you need
https://regex101.com/r/SYz9i5/1
Explanation
^(.*-[^-]{5})[^-]*$
^ Start of line
( Capture group 1 start
.* Any number of any character
- hyphen
[^-]{5} 5 non-hyphen characters
) Capture group 1 end
[^-]* Any number of non-hyphen characters
$ End of line
Another simpler one is
^(.*-.{5}).*$
This one should be quite straightforward.
It makes use of the greedy behaviour of the first .*, which will try to match as much as possible, so the - it ends up on will be the last one with at least 5 characters following it.
https://regex101.com/r/CFqgeF/1/
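To see how the two patterns differ, here is a small Python sketch. The helper `first_group` and the short input `X008-TGa19-ER7` are assumptions added for illustration; they are not part of the answer.

```python
import re

# The two patterns from this answer.
STRICT = re.compile(r"^(.*-[^-]{5})[^-]*$")
SIMPLE = re.compile(r"^(.*-.{5}).*$")


def first_group(pattern, value):
    """Return capture group 1, or None when there is no match (hypothetical helper)."""
    m = pattern.match(value)
    return m.group(1) if m else None


# Both patterns agree on the inputs from the question.
print(first_group(STRICT, "X008-TGa19-ER751QF7"))  # X008-TGa19-ER751
print(first_group(SIMPLE, "X008-TGa19-ER751QF7"))  # X008-TGa19-ER751

# They differ when fewer than 5 characters follow the last hyphen:
# the strict pattern fails, while the simple one falls back to an earlier
# hyphen that does have 5 characters after it.
print(first_group(STRICT, "X008-TGa19-ER7"))  # None
print(first_group(SIMPLE, "X008-TGa19-ER7"))  # X008-TGa19
```

Which behaviour is preferable depends on whether short trailing segments can occur in the real data.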
Upvotes: 1
Reputation: 48110
If you are open to a non-regex solution, you can use this, which is based on splitting, slicing and joining the string:
>>> my_str = "X008-TGa19-ER751QF7"
>>> '-'.join(s[:5] for s in my_str.split('-'))
'X008-TGa19-ER751'
Here I am splitting the string on the hyphen -, slicing each sub-string to at most five chars, and joining them back using str.join() to get the string in your desired format.
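Note that the one-liner trims every segment to five characters, which happens to be harmless for the sample input. If only the part after the last hyphen should be trimmed, a possible variation (not from the answer; `trim_last_segment` and the `width` parameter are names invented here) uses `str.rpartition`:

```python
def trim_last_segment(value, width=5):
    """Keep everything up to the last hyphen, plus the first `width` chars after it.

    Hypothetical helper. If the string has no hyphen at all, rpartition
    returns ('', '', value), so the whole string is simply cut to `width` chars.
    """
    head, sep, tail = value.rpartition("-")
    return head + sep + tail[:width]


print(trim_last_segment("X008-TGa19-ER751QF7"))     # X008-TGa19-ER751
# A long middle segment is left alone here, unlike with the split/join version.
print(trim_last_segment("X008-TGa19xyz-ER751QF7"))  # X008-TGa19xyz-ER751
```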
Upvotes: 1
Reputation: 4075
(\w+-\w+-\w{5})
seems to capture what you're asking for.
Example:
https://regex101.com/r/PcPSim/1
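A quick Python check of this pattern, as a sketch only: it works for the sample input, but it assumes exactly three hyphen-separated parts. The four-part string below is a made-up input to show that limitation.

```python
import re

# Pattern from this answer: two hyphens, then 5 word chars.
PATTERN = re.compile(r"(\w+-\w+-\w{5})")

# Sample input from the thread.
m = PATTERN.search("X008-TGa19-ER751QF7")
print(m.group(1))  # X008-TGa19-ER751

# With an extra leading part (hypothetical input), the match stops after the
# second hyphen instead of the last one.
m = PATTERN.search("A1-X008-TGa19-ER751QF7")
print(m.group(1))  # A1-X008-TGa19
```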
Upvotes: 1