Reputation: 391
I have two dataframes,df1:
name mark
0 Alex [Tue, 0.0, 10, 0.0, 0.0]
1 John [Tue, 0.0, 10, 0.0, 0.0]
2 Tom [Tue, 0.0, 10, 0.0, 0.0]
3 Tim [Tue, 0.0, 10, 0.0, 0.0]
and df2:
name mark1
0 Alex [11.0, 0.0, 1.0]
1 John [12.0, 0.0, 4.0]
2 Tom [12.0, 0.0, 4.0]
and when i do :
merged = pd.merge(df1,df2,how='outer',on='name').fillna(0)
i expected to get something like this:
name mark mark1
0 Alex [Tue, 0.0, 10, 0.0, 0.0] [11.0, 0.0, 1.0]
1 John [Tue, 0.0, 10, 0.0, 0.0] [12.0, 0.0, 4.0]
2 Tom [Tue, 0.0, 10, 0.0, 0.0] [12.0, 0.0, 4.0]
3 Tim [Tue, 0.0, 10, 0.0, 0.0] 0
but i got something like this(which looks more like concat):
name mark mark1
0 Alex [Tue, 0.0, 10, 0.0, 0.0] 0
1 John [Tue, 0.0, 10, 0.0, 0.0] 0
2 Tom [Tue, 0.0, 10, 0.0, 0.0] 0
3 Tim [Tue, 0.0, 10, 0.0, 0.0] 0
4 Alex 0 [11.0, 0.0, 1.0]
5 John 0 [12.0, 0.0, 4.0]
6 Tom 0 [12.0, 0.0, 4.0]
can someone please tell me what wrong am i doing? so this is my whole code :
name mark
0 Alex [Mon, 10.12, 12, 10.0, 17.0]
1 Alex [Wed, 10.12, 15, 10.0, 17.0]
2 Alex [Fri, 10.12, 7, 10.0, 17.0]
3 Alex [Tue, 0.0, 10, 0.0, 0.0]
4 Alex [Thu, 0.0, 16, 0.0, 0.0]
5 Alex [Sat, 0.0, 2, 0.0, 0.0]
6 Alex [Sun, 0.0, 12, 0.0, 0.0]
7 John [Fri, 10.12, 7, 10.0, 17.0]
8 John [Mon, 10.12, 12, 10.0, 17.0]
9 John [Tue, 0.0, 10, 0.0, 0.0]
10 John [Wed, 0.0, 15, 0.0, 0.0]
11 John [Thu, 0.0, 16, 0.0, 0.0]
12 John [Sat, 0.0, 2, 0.0, 0.0]
13 John [Sun, 0.0, 12, 0.0, 0.0]
14 Tom [Wed, 10.12, 15, 10.0, 17.0]
15 Tom [Mon, 10.12, 12, 10.0, 17.0]
16 Tom [Fri, 10.12, 7, 10.0, 17.0]
17 Tom [Tue, 0.0, 10, 0.0, 0.0]
18 Tom [Thu, 0.0, 16, 0.0, 0.0]
19 Tom [Sat, 0.0, 2, 0.0, 0.0]
20 Tom [Sun, 0.0, 12, 0.0, 0.0]
21 Tim [Mon, 10.12, 12, 10.0, 17.0]
22 Tim [Fri, 10.12, 7, 10.0, 17.0]
23 Tim [Tue, 0.0, 10, 0.0, 0.0]
24 Tim [Wed, 0.0, 15, 0.0, 0.0]
25 Tim [Thu, 0.0, 16, 0.0, 0.0]
26 Tim [Sat, 0.0, 2, 0.0, 0.0]
27 Tim [Sun, 0.0, 12, 0.0, 0.0]
and then i do :
df = (df.groupby(['name'])['mark'].apply(list).apply(lambda x: [list(y) for y
in set([tuple(j) for j in x])]).reset_index())
which gives me :
name mark
0 Alex [[Tue, 0.0, 10, 0.0, 0.0], [Sun, 0.0, 12, 0.0,...
1 John [[Tue, 0.0, 10, 0.0, 0.0], [Sun, 0.0, 12, 0.0,...
2 Tom [[Tue, 0.0, 10, 0.0, 0.0], [Sun, 0.0, 12, 0.0,...
3 Tim [[Tue, 0.0, 10, 0.0, 0.0], [Sun, 0.0, 12, 0.0,...
second dataframe is also obtained by doing something like this.(sorry for not placing the exact dataframe cuz it was kind of messy)
Upvotes: 1
Views: 36
Reputation: 26676
Merge left, on column 'name'
df1.merge(df2, how='left',on='name')
name mark mark1
0 Alex [Tue, 0.0, 10, 0.0, 0.0] [11.0, 0.0, 1.0]
1 John [Tue, 0.0, 10, 0.0, 0.0] [12.0, 0.0, 4.0]
2 Tom [Tue, 0.0, 10, 0.0, 0.0] [12.0, 0.0, 4.0]
3 Tim [Tue, 0.0, 10, 0.0, 0.0] NaN
Upvotes: 1
Reputation: 159
The following should work:
merged = df1.merge(df2, how='left',on='name').fillna(0)
This is because in your merge you are performing a full outer join.
Upvotes: 1