Reputation: 95
Let's say I have a data frame:
df = pd.DataFrame({"a": range(1, 5), "b": range(6, 10), "c": range(11, 15), "d": range(15, 19)})
I want to filter this data frame based on the values of two columns that together form coordinate points: c holds the x coordinate and d holds the y coordinate. Given a list of x coordinates and a list of y coordinates, I want to find which points in the data frame have values that appear in those lists.
x_coord = [4,12,13,17,19]
y_coord = [16,18,25,29,32]
Using the "isin" function of pandas, how can I parse both the c and d columns of the data frame simultaneously and check them against the values in a list? (I want to be able to use this parsing method for large data frames)
Desired output: a data frame containing every full row of the original data frame whose c and d values both appear in the x and y lists, respectively.
Upvotes: 0
Views: 161
Reputation: 555
You can do this by creating a new column that holds the other two columns as a tuple and calling isin on that column, as follows:
In[0]: df['coords'] = list(zip(df['c'], df['d']))
: df[df['coords'].isin(zip(x_coord, y_coord))]
Out[0]:
   a  b   c   d   e
0  1  6  11  15 NaN
1  2  7  12  16 NaN
2  3  8  13  17 NaN
3  4  9  14  18 NaN
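Note that the tuple check matches (c, d) pairs position by position against zip(x_coord, y_coord). If the question is instead read as "c appears anywhere in x_coord and d appears anywhere in y_coord", the two columns can be tested independently. Here is a minimal sketch of both readings on the question's data; the names pairs, pair_mask and independent_mask are only illustrative, and which reading is intended is an assumption you would need to confirm:

```python
import pandas as pd

df = pd.DataFrame({"a": range(1, 5), "b": range(6, 10),
                   "c": range(11, 15), "d": range(15, 19)})
x_coord = [4, 12, 13, 17, 19]
y_coord = [16, 18, 25, 29, 32]

# Reading 1: (c, d) must equal one of the zipped (x, y) pairs.
pairs = list(zip(x_coord, y_coord))
pair_mask = pd.Series(list(zip(df["c"], df["d"])), index=df.index).isin(pairs)
by_pair = df[pair_mask]

# Reading 2: c must be in x_coord and d must be in y_coord, independently.
independent_mask = df["c"].isin(x_coord) & df["d"].isin(y_coord)
by_column = df[independent_mask]

print(by_pair)    # empty for this data: no (c, d) pair equals a zipped (x, y) pair
print(by_column)  # the single row with c=12, d=16
```

Building the mask as a separate Series avoids adding a helper column to df, which may matter for large frames.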
Or you can create a new data frame from your coordinates and do an inner join with merge to keep only the rows that match.
In[0]: import numpy as np
: import pandas as pd
: df = pd.DataFrame({"a": range(1, 5), "b": range(6, 10), "c": range(11, 15), "d": range(15, 19), "e": np.nan})
: x_coord = range(11, 15)
: y_coord = range(15, 19)
: coords = pd.DataFrame(list(zip(x_coord, y_coord)), columns=['c', 'd'])
: df.merge(coords, on=['c', 'd'], how='inner')
Out[0]:
   a  b   c   d   e
0  1  6  11  15 NaN
1  2  7  12  16 NaN
2  3  8  13  17 NaN
3  4  9  14  18 NaN
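Two things to watch with the merge approach: the result gets a fresh 0..n-1 index rather than the original row labels, and if the coordinate list repeats a pair, the matching row comes back once per repeat. The sketch below shows one way to handle both, assuming those behaviours matter for your data. The non-default index, the sample pairs, and the reset_index/set_index round trip are illustrative choices, not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({"a": range(1, 5), "b": range(6, 10),
                   "c": range(11, 15), "d": range(15, 19)},
                  index=[10, 20, 30, 40])  # hypothetical non-default index

# Hypothetical coordinate pairs; (12, 16) is repeated on purpose.
coords = pd.DataFrame([(12, 16), (12, 16), (14, 18), (99, 99)],
                      columns=["c", "d"])

matched = (
    df.reset_index()                     # keep the original index as a column
      .merge(coords.drop_duplicates(),   # avoid one output row per repeated pair
             on=["c", "d"], how="inner")
      .set_index("index")                # restore the original index labels
)
print(matched)
```

Without drop_duplicates, the row with c=12, d=16 would appear twice in the result.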
Upvotes: 1