Reputation: 859
Using Python's re.sub, is there a way to extract the first run of alphanumeric characters and disregard the rest, from a string that starts with a special character and might have special characters in the middle? For example:
re.sub('[^A-Za-z0-9]','', '#my,name')
How do I just get "my"?
re.sub('[^A-Za-z0-9]','', '#my')
Here I would also want it to just return 'my'.
Upvotes: 1
Views: 1458
Reputation: 13049
re.sub(".*?([A-Za-z0-9]+).*", r"\1", str)
The \1 in the replacement is equivalent to matchobj.group(1). In other words, it replaces the whole string with just what was matched by the part of the regexp inside the parentheses (the capturing group). $ could be added at the end of the regexp for clarity, but it is not necessary because the final .* will be greedy (match as many characters as possible).
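As a quick check of the explanation above, here is a minimal sketch that runs the same pattern on the two strings from the question. The names pattern and results are just illustrative, not part of the original answer.

```python
import re

# Lazy prefix, one capturing group for the first alphanumeric run, greedy suffix.
pattern = r".*?([A-Za-z0-9]+).*"

# Replace the whole match with the contents of group 1.
results = [re.sub(pattern, r"\1", text) for text in ("#my,name", "#my")]
print(results)  # ['my', 'my']
```

Appending $ to the pattern gives the same results for these inputs, since the trailing .* already consumes the rest of the string.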
This solution does suffer from the problem that if the string doesn't match (which would happen if it contains no alphanumeric characters), then it will simply return the original string. It might be better to attempt a match, then test whether it actually matches, and handle separately the case that it doesn't. Such a solution might look like:
import re

matchobj = re.match(".*?([A-Za-z0-9]+).*", str)  # str is the input string
if matchobj:
    print(matchobj.group(1))
else:
    print("did not match")
But the question called for the use of re.sub.
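To make the no-match caveat concrete, here is a rough sketch. It swaps in re.search on just the alphanumeric class instead of re.match with the surrounding .* parts, and the helper name first_alnum is made up for illustration.

```python
import re

def first_alnum(text):
    """Return the first run of ASCII letters/digits in text, or None if there is none."""
    match = re.search(r"[A-Za-z0-9]+", text)
    return match.group(0) if match else None

# With no alphanumeric characters, re.sub finds nothing to replace
# and hands back the input unchanged.
unchanged = re.sub(r".*?([A-Za-z0-9]+).*", r"\1", "#!?")
print(unchanged)                # #!?

print(first_alnum("#my,name"))  # my
print(first_alnum("#!?"))       # None
```

Returning None lets the caller tell "nothing found" apart from a real result, which the re.sub version cannot do.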
Upvotes: 2
Reputation: 49
This is not a complete answer. With re.findall, [A-Za-z]+ will give you ['my','name'].
Use this to further explore: https://regex101.com/
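A small sketch of that idea, assuming re.findall is the call intended here, with the first element picked out afterwards. The variable names are illustrative.

```python
import re

# All runs of ASCII letters in the string, in order of appearance.
words = re.findall(r"[A-Za-z]+", "#my,name")
print(words)  # ['my', 'name']

# Take the first run, guarding against an empty result.
first = words[0] if words else None
print(first)  # my
```

Note that [A-Za-z]+ skips digits; use [A-Za-z0-9]+ to match the character class in the question.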
Upvotes: 0