Reputation: 1741
Is it possible to do a right join where key is null in python pandas. That is, can I join DateFrames to produce only values from the right that do not match the left?
Upvotes: 5
Views: 8532
Reputation: 49886
I think this is best expressed as an index selection operation. To find all indices in one frame and not in another, try using the -
operator on the two Dataframe index
objects, as if the index objects were built-in python set objects. for example:
In [1]: dfa = pd.DataFrame({'A': range(5)}, index=range(5))
In [2]: dfb = pd.DataFrame({'A': range(10, 15)}, index=range(3,8))
In [3]: dfa
Out[3]:
A
0 0
1 1
2 2
3 3
4 4
In [4]: dfb
Out[4]:
A
3 10
4 11
5 12
6 13
7 14
In [5]: dfb.loc[set(dfb.index) - set(dfa.index)]
Out[5]:
A
5 12
6 13
7 14
Upvotes: 1
Reputation: 2503
Use https://pandas.pydata.org/docs/reference/api/pandas.Index.difference.html#pandas.Index.difference
import pandas a pd
dfa = pd.DataFrame({'A': range(5)}, index=range(5))
Out[13]:
A
0 0
1 1
2 2
3 3
4 4
dfb = pd.DataFrame({'A': range(10, 15)}, index=range(3,8))
Out[14]:
A
3 10
4 11
5 12
6 13
7 14
dfb[dfb.index.difference(dfa.index)]
Out[15]:
A
5 12
6 13
7 14
Upvotes: 0
Reputation: 308
A more compact solution with merge:
df = dfa.merge(dfb, indicator='i', how='outer').query('i == "right_only"').drop('i', 1)
print (df)
Solution taken from here:
https://stackoverflow.com/a/49487564/9539462
Upvotes: 1
Reputation: 1734
You could make a dummy column of ones in each dataframe, use pd.merge() to right join the dataframes, and then use a mask to filter out the nulls.
In [1]: dfa = pd.DataFrame({'A': range(4, 8)})
In [2]: dfb = pd.DataFrame({'A': range(6, 10)})
In [3]: dfa['ones'] = 1
In [4]: dfb['ones'] = 1
In [5]: dfa
Out[5]:
A ones
0 4 1
1 5 1
2 6 1
3 7 1
In [6]: dfb
Out[6]:
A ones
0 6 1
1 7 1
2 8 1
3 9 1
In [7]: df = pd.merge(dfa, dfb, on = 'A', how = 'right')
In [8]: df
Out[8]:
A ones_x ones_y
0 6 1 1
1 7 1 1
2 8 NaN 1
3 9 NaN 1
In [8]: df[df.ones_x.notnull()][['A']]
Out[8]:
A
0 6
1 7
Upvotes: 1