Reputation: 17704
How can an array stored in each row of a column be extracted into separate columns efficiently? (The array length is the same for every element in the series.)
import pandas as pd
d = pd.DataFrame({'foo': [1, 2, 3], 'bar': [[1, 1, 1], [2, 2, 2], [3, 3, 3]]})
I.e., how can the array [1,1,1] be extracted into bar_0, bar_1, bar_2 columns?
Is there a better way than manually iterating over the indices in the array and calling pandas.apply?
Upvotes: 2
Views: 1972
Reputation: 1213
How about this:
>>> d.join(pd.DataFrame(d['bar'].to_list(), columns=['bar_1', 'bar_2', 'bar_3']))
   foo        bar  bar_1  bar_2  bar_3
0    1  [1, 1, 1]      1      1      1
1    2  [2, 2, 2]      2      2      2
2    3  [3, 3, 3]      3      3      3
This converts the bar column to a nested list, builds a new dataframe from that list, and joins the new dataframe with your initial dataframe.
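One caveat: join aligns on the index, so with a non-default index the new dataframe should reuse the index of the original. A minimal sketch of that, assuming every list has the same length; the helper name expand_list_column is made up for illustration, and unlike the snippet above it drops the original column and numbers the new columns from 0:

```python
import pandas as pd

def expand_list_column(df, column):
    """Return a copy of df with `column` (equal-length lists) split into numbered columns."""
    # Reuse df.index so the later join lines the rows up correctly
    expanded = pd.DataFrame(df[column].to_list(), index=df.index)
    # Default column labels are 0, 1, 2, ...; turn them into bar_0, bar_1, ...
    expanded.columns = [f"{column}_{i}" for i in expanded.columns]
    return df.drop(columns=column).join(expanded)

# Non-default index to show that alignment is preserved
d = pd.DataFrame(
    {"foo": [1, 2, 3], "bar": [[1, 1, 1], [2, 2, 2], [3, 3, 3]]},
    index=[10, 20, 30],
)
result = expand_list_column(d, "bar")
print(result)
```

Because the column names are generated from the data, the same helper works for any fixed list length.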
Upvotes: 5
Reputation: 195553
import pandas as pd
d = pd.DataFrame({"foo": [1, 2, 3], "bar": [[1, 1, 1], [2, 2, 2], [3, 3, 3]]})
d = pd.concat([d, d.pop("bar").apply(pd.Series).add_prefix("bar_")], axis=1)
print(d)
Prints:
   foo  bar_0  bar_1  bar_2
0    1      1      1      1
1    2      2      2      2
2    3      3      3      3
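Since the question asks about efficiency: apply(pd.Series) builds one Series per row, which can get slow on larger frames. A rough sketch for comparing it against building the dataframe directly from the nested list, assuming all lists have the same length; n_rows and the two function names are hypothetical, and the timings will depend on your machine and pandas version:

```python
import timeit
import pandas as pd

n_rows = 2000  # hypothetical size, adjust to match your data
d = pd.DataFrame({"foo": range(n_rows), "bar": [[i, i, i] for i in range(n_rows)]})

def via_apply(df):
    # One Series object is created per row
    return df["bar"].apply(pd.Series).add_prefix("bar_")

def via_constructor(df):
    # The whole nested list is handed to the DataFrame constructor at once
    return pd.DataFrame(df["bar"].to_list(), index=df.index).add_prefix("bar_")

for fn in (via_apply, via_constructor):
    seconds = timeit.timeit(lambda: fn(d), number=3)
    print(f"{fn.__name__}: {seconds:.4f} s for 3 runs")
```

Both functions return the same bar_0, bar_1, bar_2 columns, so either result can be passed to pd.concat as above.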
Upvotes: 5