Reputation: 2443

How to extract domain from email address with Pandas

I don't know how to extract the domain part of an email address with pandas. For example, given 'kkk@gmail.com' I would like to get 'gmail.com'.

How can I do this?

Upvotes: 5

Answers (3)

Wouter Adolfsen

Reputation: 23

Both answers are useful, so I checked which approach is fastest. Of the methods provided by jezrael and Akhilesh, only jezrael's first method (the .str accessor) is robust against NaN values. However, the list comprehension (jezrael's second method) is the fastest by quite a margin, with Akhilesh's apply-based answer in second place.

Timing was done as follows:

import timeit

import pandas as pd

# Two-row test frame; the local part 'abc' is a placeholder
df = pd.DataFrame({'email': ['kkk@gmail.com', 'abc@yahoo.com']})

def method1():
    # vectorised .str accessor (jezrael's first method)
    df['domain'] = df['email'].str.split('@').str[1]
    return df

def method2():
    # apply with a lambda (Akhilesh's method)
    df['domain'] = df['email'].apply(lambda x: x.split('@')[1])
    return df

def method3():
    # plain list comprehension (jezrael's second method)
    df['domain'] = [x.split('@')[1] for x in df['email']]
    return df

print('Time for method 1:', timeit.timeit(method1, number=100000))
print('Time for method 2:', timeit.timeit(method2, number=100000))
print('Time for method 3:', timeit.timeit(method3, number=100000))

Results with nan values:

Method 1: 16.07 seconds.
Method 2: Error.
Method 3: Error.

Results without nan values:

Method 1: 15.88 seconds.
Method 2: 9.03 seconds.
Method 3: 4.04 seconds.
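
The NaN behaviour reported in these results can be reproduced with a small sketch. The sample addresses are placeholders and the variable names `via_str` and `apply_failed` are only for illustration; the point is that the .str accessor passes missing values through, while the lambda ends up calling .split on a float.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: placeholder addresses plus one missing value
df = pd.DataFrame({'email': ['kkk@gmail.com', np.nan, 'abc@yahoo.com']})

# Method 1: the .str accessor propagates the missing value as NaN
via_str = df['email'].str.split('@').str[1]

# Method 2: the lambda receives a float NaN, which has no .split method
try:
    df['email'].apply(lambda x: x.split('@')[1])
    apply_failed = False
except AttributeError:
    apply_failed = True

print(via_str.tolist())
print('apply raised AttributeError:', apply_failed)
```

The list comprehension in method 3 fails for the same reason as the lambda, since it also calls .split on every element.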

Upvotes: 0

Akhilesh L.

Reputation: 11

This can also be done using apply with a lambda function.

import pandas as pd

# Placeholder addresses for illustration
df = pd.DataFrame({'email': ['kkk@gmail.com', 'abc@yahoo.com', 'xyz@example.com']})

df['domain'] = df['email'].apply(lambda x: x.split('@')[1])
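
If the column may contain missing values, one possible variation (not part of the original answer) is to use Series.map with na_action='ignore', which skips NaN entries instead of passing them to the lambda. A minimal sketch with placeholder addresses:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value
df = pd.DataFrame({'email': ['kkk@gmail.com', np.nan, 'abc@yahoo.com']})

# na_action='ignore' leaves NaN as NaN and only calls the lambda on real strings
df['domain'] = df['email'].map(lambda x: x.split('@')[1], na_action='ignore')

print(df)
```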

Upvotes: 1

jezrael

Reputation: 863166

I believe you need str.split and then select the second value of each list by indexing with str[1]:

import pandas as pd

# 'kkk@gmail.com' is from the question; the local part 'abc' is a placeholder
df = pd.DataFrame({'email': ['kkk@gmail.com', 'abc@yahoo.com']})

df['domain'] = df['email'].str.split('@').str[1]
# faster solution if there are no NaN values
# df['domain'] = [x.split('@')[1] for x in df['email']]
print(df)
           email     domain
0  kkk@gmail.com  gmail.com
1  abc@yahoo.com  yahoo.com
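
As a further sketch, assuming the column might also contain strings without an '@' (which would make split('@')[1] raise an IndexError in the list-comprehension variant), a regular expression with str.extract returns NaN for anything that does not match. The pattern and the sample values below are illustrative assumptions, not part of the answer.

```python
import numpy as np
import pandas as pd

# Hypothetical inputs: a normal address, a string with no '@',
# a missing value, and a string containing two '@' characters
emails = pd.Series(['kkk@gmail.com', 'no-at-sign', np.nan, 'a@b@example.org'])

# Capture everything after the last '@'; non-matching rows become NaN
domain = emails.str.extract(r'@([^@\s]+)$', expand=False)

print(domain.tolist())
```

This is not a full email validator; it only picks out the text after the final '@'.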

Upvotes: 18
