Reputation: 127
I am trying to extract the email provider from the mail column of a DataFrame and store it in a new column named "Mail_Provider". For example, taking gmail from [email protected] and storing it in the "Mail_Provider" column. I would also like to extract the country ISD code from the Phone column and create a new column for it. Is there a simpler, more direct method than regex?
import pandas as pd

data = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "mail": ["[email protected]", "[email protected]", "[email protected]"],
    "Adress": ["Adress1", "Adress2", "Adress3"],
    "Phone": ["+91-1234567890", "+88-0987654321", "+27-2647589201"],
})
Table
Name mail Adress Phone
A [email protected] Adress1 +91-1234567890
B [email protected] Adress2 +88-0987654321
C [email protected] Adress3 +27-2647589201
Expected result:
Name mail Adress Phone Mail_Provider ISD
A [email protected] Adress1 +91-1234567890 gmail 91
B [email protected] Adress2 +88-0987654321 yahoo 88
C [email protected] Adress3 +27-2647589201 gmail 27
Upvotes: 5
Views: 1864
Reputation: 150735
The regex for this is fairly simple:
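# Raw strings avoid invalid-escape warnings; expand=False returns a Series.
data['Mail_Provider'] = data['mail'].str.extract(r'@(\w+)\.', expand=False)
data['ISD'] = data['Phone'].str.extract(r'\+(\d+)-', expand=False)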
If you really want to avoid regex, @Eva's answer would be the way to go.
Upvotes: 9
Reputation: 92854
Mixed approach (regex and simple slicing; the slice assumes every ISD code is exactly two digits):
In [693]: df['Mail_Provider'] = df['mail'].str.extract('@([^.]+)')
In [694]: df['ISD'] = df['Phone'].str[1:3]
In [695]: df
Out[695]:
Name mail Adress Phone Mail_Provider ISD
0 A [email protected] Adress1 +91-1234567890 gmail 91
1 B [email protected] Adress2 +88-0987654321 yahoo 88
2 C [email protected] Adress3 +27-2647589201 gmail 27
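To illustrate the two-digit assumption behind the slice, here is a small sketch comparing `str[1:3]` with a pattern-based extraction. The phone numbers are hypothetical values chosen only to show codes of different lengths; they are not from the question.

```python
import pandas as pd

# Hypothetical phone numbers with ISD codes of one, two, and three digits.
phones = pd.Series(["+1-2025550123", "+91-1234567890", "+353-871234567"])

# Fixed-width slice: correct only when the code is exactly two digits.
sliced = phones.str[1:3]

# Pattern-based extraction: captures every digit between "+" and "-".
extracted = phones.str.extract(r"\+(\d+)-", expand=False)

print(sliced.tolist())
print(extracted.tolist())
```

If all the numbers in your data really do have two-digit codes, the slice is fine and is the shorter option.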
Upvotes: 5
Reputation: 670
A lambda function will also work:
data['Mail_Provider'] = data['mail'].apply(lambda x: x.split("@")[1].split(".")[0])
data['ISD'] = data['Phone'].apply(lambda x: x.split("+")[1].split("-")[0])
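The same splitting logic can probably also be written with the vectorized `.str` accessor instead of `apply`, still without regex. This is a minimal sketch, not a definitive version: the email addresses are hypothetical placeholders (the originals are obscured in the question), and it assumes every address contains "@" and every phone number has the form "+code-number".

```python
import pandas as pd

# Hypothetical sample data; the email addresses are placeholders.
data = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "mail": ["a@gmail.com", "b@yahoo.com", "c@gmail.com"],
    "Phone": ["+91-1234567890", "+88-0987654321", "+27-2647589201"],
})

# Take the text after "@", then the text before the first ".".
data["Mail_Provider"] = data["mail"].str.split("@").str[1].str.split(".").str[0]

# Take the text before the first "-", then drop the leading "+".
data["ISD"] = data["Phone"].str.split("-").str[0].str.lstrip("+")

print(data)
```

One practical difference from the lambda version is that the `.str` methods pass missing values through as NaN, whereas `x.split(...)` inside `apply` raises an error on a NaN entry.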
Upvotes: 4