PabloG

Reputation: 454

How to split a dataframe column of lists (same length in every row) into separate columns in the same row

I have the following dataframe:

import pandas as pd

df = pd.DataFrame({'A': ['1', '2', '3'], 'List': [['a1', 'a2'], ['b1', 'b2'], ['c1', 'c2']]})

Out[18]: 
   A      List
0  1  [a1, a2]
1  2  [b1, b2]
2  3  [c1, c2]

I would like to split the column List into two new columns (L1 and L2) in the same row:

   A  L1  L2
0  1  a1  a2
1  2  b1  b2
2  3  c1  c2

What would be the fastest way to do this?

It would be great to also assign the names of the new columns (L1 and L2) at the same time.

Thank you in advance and best regards,

Pablo G

Upvotes: 2

Views: 89

Answers (2)

CypherX

Reputation: 7353

Solution

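Try this: combine pd.concat with df[col].apply(pd.Series). The snippets use the column name B from the example further down; for the dataframe in the question, use 'List' instead.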

# Option-1
pd.concat([df['A'], df['B'].apply(pd.Series).rename(columns={0: 'L1', 1: 'L2'})], axis=1)

# Option-2
# credit: Mark Wang, for suggesting index=['L1', 'L2']
pd.concat([df['A'], df['B'].apply(pd.Series, index=['L1', 'L2'])], axis=1)


If you want to keep only the columns L1 and L2

# Option-1
df['B'].apply(pd.Series).rename(columns={0: 'L1', 1: 'L2'})

# Option-2
# credit: Mark Wang, for suggesting index=['L1', 'L2']
df['B'].apply(pd.Series, index=['L1', 'L2'])

If you want to keep all the original columns

# with prefix
pd.concat([df, df['B'].apply(pd.Series).add_prefix('B_')], axis=1)

# with user given column-names
pd.concat([df, df['B'].apply(pd.Series).rename(columns={0: 'L1', 1: 'L2'})], axis=1)

Logic:

  • Concatenate df and df_expanded along the columns (axis=1).
  • Here df_expanded is the result of df[col].apply(pd.Series), which expands each list into separate columns.
  • The .add_prefix('B_') call makes it clear that the new columns originated from column B.
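
The steps above can be written out one at a time. This is only a sketch: the intermediate names df_expanded and result are illustrative, and the data is the same small frame used in the example.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [['11', '12'], ['21', '22'], ['31', '32']]})

# Step 1: turn each list in column B into a row of a new frame.
# The new columns are labelled with the integers 0 and 1.
df_expanded = df['B'].apply(pd.Series)

# Step 2: prefix the new labels so their origin is visible (B_0, B_1).
df_expanded = df_expanded.add_prefix('B_')

# Step 3: attach the expanded columns to the original frame, side by side.
result = pd.concat([df, df_expanded], axis=1)
print(result)
```

Since both frames share the same index, concatenating along axis=1 lines the rows up without any extra alignment work.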

Example

df = pd.DataFrame({'A': [1,2,3], 
                   'B': [['11', '12'], 
                         ['21', '22'], 
                         ['31', '32']]
                   })
col = 'B'
pd.concat([df, df[col].apply(pd.Series).add_prefix(f'{col}_')], axis=1)


Upvotes: 1

Mark Wang

Reputation: 2757

Try:

df[['A']].join(df['List'].apply(pd.Series, index=['L1', 'L2']))
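
Since the question asks about speed, here is a possible variant that is not part of either answer: build the new columns with the DataFrame constructor from df['List'].tolist() instead of calling apply(pd.Series). It avoids creating one Series per row, so it tends to be faster on larger frames, but this is not measured here and is worth timing on your own data. It assumes every list has exactly two elements, as stated in the question.

```python
import pandas as pd

df = pd.DataFrame({'A': ['1', '2', '3'],
                   'List': [['a1', 'a2'], ['b1', 'b2'], ['c1', 'c2']]})

# Build the two new columns in one go from the plain Python lists.
# Passing index=df.index keeps the rows aligned with the original frame.
expanded = pd.DataFrame(df['List'].tolist(), index=df.index, columns=['L1', 'L2'])

# Join on the index, keeping only column A from the original.
result = df[['A']].join(expanded)
print(result)
```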

Upvotes: 2
