luknajirku
luknajirku

Reputation: 85

How to unnest a column in a Pandas DataFrame?

I have the following this question where one of the columns is an object (list type cell):

I don't want to use explode (using an older version of pandas). How to do the same for dataframe with three columns?

df

   A       B     C
0  1  [1, 2]     3
1  1  [1, 2]     4
2  2  [3, 4]     5

My expected output is:

   A  B  C
0  1  1  3
1  1  2  3
3  1  1  4
4  1  2  4
5  2  3  5
6  2  4  5

I found these two methods useful. How to add third column to this code.

df.set_index('A').B.apply(pd.Series).stack().reset_index(level=0).rename(columns={0:'B'})

or

df=pd.DataFrame({'A':df.A.repeat(df.B.str.len()),'B':np.concatenate(df.B.values)})

Upvotes: 1

Views: 902

Answers (2)

sammywemmy
sammywemmy

Reputation: 28644

You can use itertools to reshape your data :

from itertools import product,chain

pd.DataFrame(chain.from_iterable((product([a],b,[c])) 
                                 for a,b,c in df.to_numpy()),
             columns = df.columns)


    A   B   C
0   1   1   3
1   1   2   3
2   1   1   4
3   1   2   4
4   2   1   5
5   2   4   5

Upvotes: 0

ALollz
ALollz

Reputation: 59529

You set the index to be all of the columns you want to keep tied to the list you explode:

(df.set_index(['A', 'C'])['B']
   .apply(pd.Series).stack()
   .reset_index()
   .drop(columns='level_2').rename(columns={0: 'B'}))

   A  C  B
0  1  3  1
1  1  3  2
2  1  4  1
3  1  4  2
4  2  5  3
5  2  5  4

Or for the second method also repeat 'C'

pd.DataFrame({'A': df.A.repeat(df.B.str.len()),
              'C': df.C.repeat(df.B.str.len()),
              'B': np.concatenate(df.B.to_numpy())})

Upvotes: 2

Related Questions