Reputation: 724
I have a pandas DataFrame in which one column holds strings whose words are separated by '_'. I would like to extract the last element of each string (which is a number) and put it in a new column. I tried the following:
import pandas as pd
df = pd.DataFrame({'strings':['some_string_25','a_different_one_13','and_a_last_one_40']})
df.assign(number = lambda x: x.strings.str.split('_')[0])
but it gives me this in my last column
number
some
string
25
but I would like to get this
number
25
13
40
How can I do this?
Upvotes: 3
Views: 4399
Reputation: 860
Please try this
df = pd.DataFrame({'strings':['some_string_25','a_different_one_13','and_a_last_one_40']})
df['number'] = df.strings.apply(lambda x: x.split('_')[-1])
df
Upvotes: 0
Reputation: 862511
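Use Series.str.split to split each string, then select the last value of each resulting list by indexing with .str[-1]. Alternatively, use Series.str.extract to capture the integer at the end of each string: in the pattern, (\d+) matches one or more digits and $ anchors the match to the end of the string: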
df['last'] = df['strings'].str.split('_').str[-1]
df['last1'] = df['strings'].str.extract(r'(\d+)$')
print (df)
              strings  last  last1
0      some_string_25    25     25
1  a_different_one_13    13     13
2   and_a_last_one_40    40     40
The difference between the two approaches shows up with different data, where some strings contain no '_' or no trailing number:
df = pd.DataFrame({'strings':['some_string_25','a_different_one_13','and_a_last_one_40',
'aaaa', 'sss58']})
df['last'] = df['strings'].str.split('_').str[-1]
df['last1'] = df['strings'].str.extract(r'(\d+)$')
print (df)
              strings   last  last1
0      some_string_25     25     25
1  a_different_one_13     13     13
2   and_a_last_one_40     40     40
3                aaaa   aaaa    NaN
4               sss58  sss58     58
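To show why the attempt in the question behaves differently, here is a minimal sketch using the question's data. Plain `[0]` on the split result selects the row labelled 0 (a single list), while `.str[-1]` indexes inside every row's list. The `pd.to_numeric` step is an addition that is not part of the answer above; it assumes an integer column is wanted, since the split pieces are still strings.

```python
import pandas as pd

df = pd.DataFrame({'strings': ['some_string_25', 'a_different_one_13', 'and_a_last_one_40']})

parts = df['strings'].str.split('_')

# Plain [0] indexes the Series by label: it returns the list from row 0
first_row = parts[0]

# .str[-1] indexes inside each row's list: it returns the last piece per row
last = parts.str[-1]

# Assumption: an integer column is wanted (the pieces are strings until converted)
df['number'] = pd.to_numeric(last)
print(df)
```

Note that `pd.to_numeric` would raise on a value such as 'aaaa' from the changed data unless `errors='coerce'` is passed.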
Upvotes: 9
Reputation: 7852
You can do:
df['number']=df['strings'].apply(lambda row: row.split('_')[-1])
or:
df['number']=[row[-1] for row in df['strings'].str.split('_')]
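As a quick sanity check, the sketch below confirms that both variants produce the same values on the question's data. The third variant, `str.rsplit` with `n=1`, is not from this answer; it is included as an assumption-labelled alternative that splits only once, from the right.

```python
import pandas as pd

df = pd.DataFrame({'strings': ['some_string_25', 'a_different_one_13', 'and_a_last_one_40']})

# Variant 1: Python-level split applied to each string
via_apply = df['strings'].apply(lambda s: s.split('_')[-1])

# Variant 2: list comprehension over the vectorized split result
via_listcomp = [parts[-1] for parts in df['strings'].str.split('_')]

# Extra variant (not in the answer above): split once from the right only
via_rsplit = df['strings'].str.rsplit('_', n=1).str[-1]

print(via_apply.tolist(), via_listcomp, via_rsplit.tolist())
```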
Upvotes: 1