Guru Prasad

Reputation: 127

Matching @user with regex

How do I match words that begin with @ and end with ;, ., :, or a space?

The words can contain any alphanumeric characters and underscores.

I have come up with ^@([a-zA-Z0-9_])*[:;, ]$, which seems to work only for single-word sentences.

Upvotes: 1

Views: 239

Answers (2)

nhahtdh

Reputation: 56809

Just remove the anchors ^ and $ and you will be good to go.

If you don't want to match an empty name, as in "Example @ nothing", use the "1 or more" quantifier + instead of *, i.e. @([a-zA-Z0-9_]+)[:;, ]
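
A quick way to see the difference, as a small sketch with Python's re module. The variable names are illustrative, and the * version here has the quantifier placed inside the group so that the capture is easy to read:

```python
import re

text = "Example @ nothing"

# With *, a bare "@ " still matches and the group captures an empty string.
star_matches = re.findall(r'@([a-zA-Z0-9_]*)[:;, ]', text)

# With +, at least one name character is required, so nothing matches.
plus_matches = re.findall(r'@([a-zA-Z0-9_]+)[:;, ]', text)

print(star_matches)  # ['']
print(plus_matches)  # []
```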

Restricting the username to 1-15 characters can be done by replacing * with {1,15}, i.e. @([a-zA-Z0-9_]{1,15})[:;, ].

If you want the @ sign and the ending character included in the result, @[a-zA-Z0-9_]{1,15}[:;, ] is sufficient.

If you want to capture the name only, you can use this @([a-zA-Z0-9_]{1,15})[:;, ]
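
To compare the two forms side by side, here is a short sketch using Python's re.findall. The sample text and variable names are made up for illustration:

```python
import re

text = "ping @alice, @bob_99; and @carol: thanks"

# No capture group: findall returns the whole match, including @ and the terminator.
whole = re.findall(r'@[a-zA-Z0-9_]{1,15}[:;, ]', text)

# One capture group: findall returns only what the group captured.
names = re.findall(r'@([a-zA-Z0-9_]{1,15})[:;, ]', text)

print(whole)  # ['@alice,', '@bob_99;', '@carol:']
print(names)  # ['alice', 'bob_99', 'carol']
```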

If the token sits at the very end of the string with no trailing special character and you still want to capture it, change [:;, ] to (?:[:;, ]|$)
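
A minimal sketch of the end-of-string case in Python (the sample text is an assumption for illustration). Note that in Python, $ also matches just before a trailing newline:

```python
import re

text = "thanks @alice, and also @bob"

# Terminator required: the final "@bob" is missed because nothing follows it.
strict = re.findall(r'@([a-zA-Z0-9_]{1,15})[:;, ]', text)

# Terminator or end of string: the non-capturing group (?:...) keeps
# findall returning only the name.
relaxed = re.findall(r'@([a-zA-Z0-9_]{1,15})(?:[:;, ]|$)', text)

print(strict)   # ['alice']
print(relaxed)  # ['alice', 'bob']
```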

Upvotes: 4

Blair

Reputation: 15788

^ matches the start of a string (or line, in multi-line mode), while $ matches the end, so you need to get rid of them:

>>> import re
>>> sentence = "foo bar @match don't match @success;"
>>> re.findall('@([a-zA-Z0-9_])*[:;, ]', sentence)
['h', 's']

It only captures the last letter because the quantifier (the *) is outside the capturing parentheses, so the group is repeated and keeps only its final iteration. Move it inside and you get:

>>> re.findall('@([a-zA-Z0-9_]*)[:;, ]', sentence)
['match', 'success']

If you want to capture the @ and trailing character too, just move them inside the parentheses as well:

>>> re.findall('(@[a-zA-Z0-9_]*[:;, ])', sentence)
['@match ', '@success;']

And as mentioned in the comments on the question, you may or may not want to restrict it to a certain number of characters:

>>> sentence = "foo bar @match don't match @somereallylongnamehere @success;"
>>> re.findall('(@[a-zA-Z0-9_]{1,15}[:;, ])', sentence)
['@match ', '@success;']

(Of course, the length restriction could be added to any of the previous expressions, not just this last one).
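
For instance, a sketch of the name-only capture with the same length limit, reusing the sentence above:

```python
import re

sentence = "foo bar @match don't match @somereallylongnamehere @success;"

# The 22-character name is skipped: after at most 15 name characters the
# pattern needs a terminator, and the next character is still a letter.
names = re.findall(r'@([a-zA-Z0-9_]{1,15})[:;, ]', sentence)

print(names)  # ['match', 'success']
```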

Upvotes: 3
