Will

Reputation: 2410

python regex return non-capturing group

I want to generate a username from an email address: the first letter of the first name followed by the first seven letters of the last name.

For example:

getUsername("[email protected]")
mmylastn

Here is the code for getUsername:

import re

def getUsername(email):
    return re.match(r"(.){1}[a-z]+.([a-z]{7})", email.replace('-', '')).group()

If I call .group(1,2), I can see that the captured groups are m and mylastn, so that part is fine. But .group() doesn't return just the capturing groups; it also returns everything between them: myfirstnamemlastn

Can someone explain this behavior?

Upvotes: 3

Views: 149

Answers (1)

Wiktor Stribiżew

Reputation: 626804

First of all, an unescaped . in a pattern is a metacharacter that matches any character except a newline (unless re.DOTALL is set). To match a literal dot, you need to escape it in the regex pattern as \.
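A small sketch of the difference, using made-up sample strings (they are assumptions for illustration, not taken from the question):

```python
import re

# Hypothetical inputs: one with a dot separator, one with no separator at all.
dotted = "john.smithson"
no_separator = "johnsmithson"

unescaped = r"[a-z]+.[a-z]+"   # . matches any character except a newline
escaped = r"[a-z]+\.[a-z]+"    # \. matches only a literal dot

unescaped_hits = [bool(re.fullmatch(unescaped, s)) for s in (dotted, no_separator)]
escaped_hits = [bool(re.fullmatch(escaped, s)) for s in (dotted, no_separator)]

print(unescaped_hits)  # [True, True]: the unescaped dot also accepts a plain letter
print(escaped_hits)    # [True, False]: the escaped dot requires a real "."
```

The unescaped version accepts the string without a dot because the regex engine backtracks and lets . consume an ordinary letter.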

Also, the {1} limiting quantifier is always redundant, since every pattern element matches exactly once by default. You can safely remove it from any regex.
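As a quick check, assuming an arbitrary sample string, both forms capture the same thing:

```python
import re

sample = "abc"  # arbitrary illustrative input

with_quantifier = re.match(r"(.){1}", sample)
without_quantifier = re.match(r"(.)", sample)

# Both patterns capture exactly one character into group 1.
print(with_quantifier.group(1), without_quantifier.group(1))  # a a
```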

Next, if you need mmylastn as the result, you cannot use match.group(): called with no arguments, .group() returns the overall match value (group 0), not the concatenated capturing group values.
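The different accessors can be compared side by side. This is only a sketch: the address below is a hypothetical stand-in, since the one in the question is not visible.

```python
import re

# Hypothetical address used purely for illustration.
m = re.match(r"(.)[a-z]+\.([a-z]{7})", "john.smithson@example.com")

whole = m.group()        # whole match, same as m.group(0)
pair = m.group(1, 2)     # tuple of the requested groups
all_groups = m.groups()  # tuple of all capturing groups
joined = "".join(all_groups)

print(whole)       # john.smithso
print(pair)        # ('j', 'smithso')
print(all_groups)  # ('j', 'smithso')
print(joined)      # jsmithso
```

The whole match includes the text consumed by the uncaptured [a-z]+ and the dot, which is why it is longer than the joined groups.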

So, in your case,

  • Check whether there is a match first: re.match returns None when the pattern does not match, and calling .groups() on None raises an AttributeError
  • Then join the values in match.groups()
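To illustrate the failure mode the first bullet guards against, here is a sketch with a hypothetical address that has no dot before the @:

```python
import re

# Hypothetical input: no "." in the local part, so the pattern cannot match.
m = re.match(r"(.)[a-z]+\.([a-z]{7})", "nodot@example.com")
print(m)  # None

try:
    m.groups()
    error_name = None
except AttributeError as exc:
    # None has no .groups() method, hence the exception.
    error_name = type(exc).__name__

print(error_name)  # AttributeError
```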

You can use


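import re

def getUsername(email):
    # Strip hyphens, then capture the first character and the first
    # seven letters that follow the literal dot.
    m = re.match(r"(.)[a-z]+\.([a-z]{7})", email.replace('-', ''))
    if m:
        # Join the captured groups; m.group() would return the whole match.
        return "".join(m.groups())
    return email  # no match: fall back to the original address

print(getUsername("[email protected]"))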


Upvotes: 2
