Reputation: 579
Here is an example of my data set:
import pandas as pd
d = {'numbers': [['1.9x1.4x2.0','1.5x1.1x1.3','11','8x10','3.7x3.8'],['1.0x1.5', '1.7x0.7', '1.4', '0.8', '3.4x4.2x4.5', '1.0x1.5']]}
df2 = pd.DataFrame(data=d)
I want to extract the first number from each comma-separated element and convert it to a float. So my expected output is
df2['output'] = [[1.9, 1.5, 11, 8, 3.7], [1.0, 1.7, 1.4, 0.8, 3.4, 1.0]]
I am not sure how to get the first number when an x is present; str[0] will not work. The only other thing I can think of is
df2.numbers.apply(lambda x: x.split(',') ).apply(lambda x: [float(i) for i in x])
But this would only work if the x were not there. Please help!
Upvotes: 2
Views: 172
Reputation: 14721
Using a regex, in case the separator is a different letter and not just x:
import pandas as pd
import re
d = {'numbers': [['1.9x1.4x2.0','1.5d1.1x1.3','11','8z10','3.7x3.8'],
['1.0x1.5', '1.7x0.7', '1.4', '0.8', '3.4x4.2x4.5', '1.0x1.5']]}
df2 = pd.DataFrame(data=d)
# Take the first integer or decimal number in each string and convert it to float
df2['output'] = df2['numbers'].apply(lambda cell: [float(re.search(r'\d+(\.\d+)?', value).group(0)) for value in cell])
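If some strings might contain no number at all, re.search returns None and calling .group(0) on it raises an AttributeError. Below is a minimal sketch of a more defensive variant in plain Python. The helper names first_number and first_numbers are hypothetical, and returning None for a string without a number is an assumption about the desired behaviour, not something the question specifies.

```python
import re

# Matches the first integer or decimal number, e.g. "11" or "1.9"
_FIRST_NUMBER = re.compile(r"\d+(?:\.\d+)?")

def first_number(value):
    """Return the first number in `value` as a float, or None if there is none."""
    match = _FIRST_NUMBER.search(value)
    return float(match.group(0)) if match else None

def first_numbers(cell):
    """Apply first_number to every string in one list-valued cell."""
    return [first_number(value) for value in cell]

row = ['1.9x1.4x2.0', '1.5d1.1x1.3', '11', '8z10', '3.7x3.8']
result = first_numbers(row)
print(result)  # [1.9, 1.5, 11.0, 8.0, 3.7]
```

With the DataFrame from this answer, the same helper could be passed directly to apply: df2['numbers'].apply(first_numbers).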
Upvotes: 1
Reputation: 82765
Using apply
Ex:
import pandas as pd
d = {'numbers': [['1.9x1.4x2.0','1.5x1.1x1.3','11','8x10','3.7x3.8'],['1.0x1.5', '1.7x0.7', '1.4', '0.8', '3.4x4.2x4.5', '1.0x1.5']]}
df2 = pd.DataFrame(data=d)
df2['output']= df2["numbers"].apply(lambda x: [i.split("x")[0] for i in x])
print(df2)
Output:
numbers output
0 [1.9x1.4x2.0, 1.5x1.1x1.3, 11, 8x10, 3.7x3.8] [1.9, 1.5, 11, 8, 3.7]
1 [1.0x1.5, 1.7x0.7, 1.4, 0.8, 3.4x4.2x4.5, 1.0x... [1.0, 1.7, 1.4, 0.8, 3.4, 1.0]
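Note that the values in the output column above are still strings, since split returns strings; the question asks for floats. A minimal sketch of the same split-based idea with the conversion added, shown in plain Python so it runs without pandas (the helper name first_dimension is hypothetical, and it assumes the separator is always x):

```python
def first_dimension(value, sep="x"):
    """Return the part before the first separator, converted to float."""
    return float(value.split(sep)[0])

rows = [
    ['1.9x1.4x2.0', '1.5x1.1x1.3', '11', '8x10', '3.7x3.8'],
    ['1.0x1.5', '1.7x0.7', '1.4', '0.8', '3.4x4.2x4.5', '1.0x1.5'],
]

output = [[first_dimension(value) for value in row] for row in rows]
print(output)
# [[1.9, 1.5, 11.0, 8.0, 3.7], [1.0, 1.7, 1.4, 0.8, 3.4, 1.0]]
```

Applied to the DataFrame, this would look like df2["numbers"].apply(lambda x: [first_dimension(i) for i in x]).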
Upvotes: 1