Reputation: 53
So let's say I have a table with these values:
Name | Transportation |
---|---|
Mike | air |
Sarah | car |
Trevor | air |
Carl | car |
I'd like each person paired with each transportation mode, so the outcome would look like this:
Name | Transportation |
---|---|
Mike | air |
Mike | car |
Sarah | air |
Sarah | car |
Trevor | air |
Trevor | car |
Carl | air |
Carl | car |
I tried creating a list and then exploding the values, but I had trouble adding a list as column values. What's the best way to go about this?
Upvotes: 1
Views: 1208
Reputation: 2436
For
import itertools
import pandas as pd
df = pd.DataFrame({
    'Name': ['a', 'b', 'c', 'd'],
    'Transportation': ['t1', 't2', 't1', 't2']
})
that is:
Name Transportation
0 a t1
1 b t2
2 c t1
3 d t2
This code:
pd.DataFrame([{'Name': t[0], 'Transportation': t[1]} for t in itertools.product(df['Name'].unique(), df['Transportation'].unique())])
returns:
Name Transportation
0 a t1
1 a t2
2 b t1
3 b t2
4 c t1
5 c t2
6 d t1
7 d t2
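For reference, here is the same idea applied to the question's data, as a minimal sketch. It assumes the column is spelled `Transportation` and passes the tuples from `itertools.product` straight to the constructor via `columns=`; the variable names `names`, `modes`, and `out` are only for illustration.

```python
import itertools

import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'Name': ['Mike', 'Sarah', 'Trevor', 'Carl'],
    'Transportation': ['air', 'car', 'air', 'car'],
})

# Unique values of each column, in order of first appearance.
names = df['Name'].unique()
modes = df['Transportation'].unique()

# Each tuple yielded by product() becomes one row of the result.
out = pd.DataFrame(list(itertools.product(names, modes)),
                   columns=['Name', 'Transportation'])
print(out)
```

Taking `unique()` first matters if a name or mode appears more than once in the input; otherwise the product would contain duplicate rows.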
Upvotes: 0
Reputation: 75080
Another way would be to use df.get on each column, take the unique values, and then build a MultiIndex.from_product:
cols = ['Name','Transportation']
comb = pd.MultiIndex.from_product(map(pd.unique,map(df.get,df[cols])))
out = pd.DataFrame(comb.to_list(),columns=cols)
print(out)
Name Transportation
0 Mike air
1 Mike car
2 Sarah air
3 Sarah car
4 Trevor air
5 Trevor car
6 Carl air
7 Carl car
Or take the levels of a MultiIndex built from the frame:
cols = ['Name','Transportation']
comb = pd.MultiIndex.from_product(pd.MultiIndex.from_frame(df[cols]).levels)
out = pd.DataFrame(comb.to_list(),columns=cols)
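One thing to check with the `levels` variant: as far as I know, the levels of a MultiIndex built this way are stored sorted, so the rows come out in alphabetical order rather than in order of appearance. A small sketch on the question's data to see this (the names `levels` and `out` are illustrative):

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'Name': ['Mike', 'Sarah', 'Trevor', 'Carl'],
    'Transportation': ['air', 'car', 'air', 'car'],
})
cols = ['Name', 'Transportation']

# Each level holds the unique values of one column.
levels = pd.MultiIndex.from_frame(df[cols]).levels

# Cartesian product of the levels, turned back into a DataFrame.
out = pd.DataFrame(pd.MultiIndex.from_product(levels).to_list(), columns=cols)

# Inspect the order of names in the result.
print(out['Name'].unique())
```

If the original row order matters, the `pd.unique` variant is probably the safer choice.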
Upvotes: 1
Reputation: 14949
TRY:
df = pd.DataFrame([{'Name': n, 'Transportation':df.Transportation.unique()} for n in df.Name]).explode('Transportation', ignore_index=True)
OR:
df = df.assign(Transportation = df['Transportation'].apply(lambda x:df.Transportation.unique())).explode('Transportation', ignore_index=True)
OUTPUT:
Name Transportation
0 Mike air
1 Mike car
2 Sarah air
3 Sarah car
4 Trevor air
5 Trevor car
6 Carl air
7 Carl car
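Since the question mentions trouble putting a list into a column, here is a self-contained sketch of the explode approach on the question's data. Wrapping the repeated list in a `pd.Series` is my own choice to keep the column one-dimensional, not something the code above requires; `ignore_index=True` needs pandas 1.1 or later.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'Name': ['Mike', 'Sarah', 'Trevor', 'Carl'],
    'Transportation': ['air', 'car', 'air', 'car'],
})

# All distinct transportation modes, in order of first appearance.
modes = list(df['Transportation'].unique())

# Give every row the full list of modes as its cell value.
out = df.assign(Transportation=pd.Series([modes] * len(df), index=df.index))

# explode() turns each list element into its own row.
out = out.explode('Transportation', ignore_index=True)
print(out)
```

Unlike the product-based answers, this keeps one block of rows per input row, so duplicate names in the input would produce duplicate blocks.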
Upvotes: 1
Reputation: 323266
Let us do
out = pd.MultiIndex.from_product([df.Name.unique(),df.Transportation.unique()]).to_frame().reset_index(drop=True)
out.columns = df.columns
Out[161]:
Name Transportation
0 Mike air
1 Mike car
2 Sarah air
3 Sarah car
4 Trevor air
5 Trevor car
6 Carl air
7 Carl car
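A possible variation, sketched on the question's data: passing `names=` to `from_product` and `index=False` to `to_frame` should give the named columns directly, without the separate `reset_index` and column assignment.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'Name': ['Mike', 'Sarah', 'Trevor', 'Carl'],
    'Transportation': ['air', 'car', 'air', 'car'],
})

# Name the index levels after the original columns, then convert the
# MultiIndex to a DataFrame with a fresh default integer index.
out = pd.MultiIndex.from_product(
    [df['Name'].unique(), df['Transportation'].unique()],
    names=list(df.columns),
).to_frame(index=False)
print(out)
```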
Upvotes: 2