Reputation: 2442
I have a dataframe in pandas that looks like this:
df = pd.DataFrame([[4, 9],[4,9],[[1,2],[3,4]]], columns=['A', 'B'])
df
A B
0 4 9
1 4 9
2 [1, 2] [3, 4]
However I would like to transform it to a table like this:
A B
0 4 9
1 4 9
2 1 2
3 3 4
Is there a way to apply a row wise function (using df.apply(function,axis=1,...)
or some other function in pandas) to do that?
Upvotes: 4
Views: 122
Reputation: 64
There is one problem in the question, it is not certain the list items in the same row always has same length. If that presumption is satisfied, then following answer works for it:
df.apply(pd.Series.explode)
A B
0 4 9
1 4 9
2 1 3
2 2 4
Upvotes: 0
Reputation: 13397
You can do:
#main piece - the rest is actually 'fixing' the multiindex piece to fit your purpose:
df=df.stack().explode().to_frame()
df["id"]=df.groupby(level=[0,1]).cumcount()
df.index=pd.MultiIndex.from_tuples(zip(df.index.get_level_values(0)+df['id'], df.index.get_level_values(1)))
df=df.drop(columns="id").unstack()
df.columns=map(lambda x: x[1], df.columns)
Outputs:
>>> df
A B
0 4 9
1 4 9
2 1 3
3 2 4
Upvotes: 1
Reputation: 6495
Another possible solution to all others proposed so far using DataFrame.melt, DataFrame.explode and DataFrame.pivot:
import pandas as pd
df = pd.DataFrame([[4, 9],[4,9],[[1,2],[3,4]]], columns=['A', 'B'])
# Create index column
df.reset_index(inplace=True)
tmp = df.melt(id_vars='index', var_name='columns').explode('value')
# Define indexes
idx = sum([list(range(len(tmp)//tmp['columns'].nunique())) for _ in range(tmp['columns'].nunique())], [])
tmp['index'] = idx
result_df = tmp.pivot(index='index', columns='columns', values='value')
result_df
columns A B
index
0 4 9
1 4 9
2 1 3
3 2 4
Upvotes: 0
Reputation: 31
Using simple for and if loops:
alist = df['A'].tolist()
blist = df['B'].tolist()
alist1=[]
blist1=[]
for k,r in zip(alist,blist):
if isinstance(k,list):
alist1.append(k[0])
blist1.append(k[1])
if isinstance(r,list):
alist1.append(r[0])
blist1.append(r[1])
else:
alist1.append(k)
blist1.append(r)
df = pd.DataFrame({'A': alist1, 'b': blist1})
Upvotes: 0
Reputation: 42926
Using DataFrame.apply
, Series.explode
, DataFrame.mask
and DataFrame.where
:
types = df.applymap(type).eq(list)
arr = df.where(types).apply(pd.Series.explode).dropna().T.to_numpy()
df.mask(types).dropna().append(pd.DataFrame(arr, columns=df.columns), ignore_index=True)
A B
0 4 9
1 4 9
2 1 2
3 3 4
Upvotes: 0
Reputation: 863291
Use lis comprehension with flatten values by chain
:
from itertools import chain
out = list(chain.from_iterable(item if isinstance(item[0],list)
else [item] for item in df[['A','B']].values))
df1 = pd.DataFrame(out, columns=['A','B'])
Or loop alternative:
out = []
for x in df[['A','B']].values:
if isinstance(x[0], list):
for y in x:
out.append(y)
else:
out.append(x)
df1 = pd.DataFrame(out, columns=['A','B'])
print (df1)
A B
0 4 9
1 4 9
2 1 2
3 3 4
Upvotes: 3
Reputation: 22503
Using list comprehension with concat
:
df = pd.DataFrame([[4, 9],[4,9],[[1,2],[3,4]],], columns=['A', 'B'])
print (pd.concat([df.loc[:1], *[pd.DataFrame(list(i),columns=df.columns) for i in df.loc[2:].to_numpy()]],
ignore_index=True))
A B
0 4 9
1 4 9
2 1 2
3 3 4
Upvotes: 1