Reputation: 485
I want to use data from one dataset to filter data in another. My data looks like this:
ls.head()
        UserID  Rating  ISBN13      GoodreadsID  Title                        Author
18266  2146832       5  0060525592       275000  Fire and Ice (Warriors, #2)  Erin Hunter
18267  2119385       5  0060525592       275000  Fire and Ice (Warriors, #2)  Erin Hunter
18272  2117173       5  0060525592       275000  Fire and Ice (Warriors, #2)  Erin Hunter
18273  2117009       5  0060525592       275000  Fire and Ice (Warriors, #2)  Erin Hunter
18274  2106234       5  0060525592       275000  Fire and Ice (Warriors, #2)  Erin Hunter
and my second dataset:
data.head()
          UserID  Rating  ISBN13      GoodreadsID  Title                                              Author
 150834  2131509       5  0143038419        19501  Eat, Pray, Love                                    Elizabeth Gilbert
  59347  2113561       5  0374528373         4934  The Brothers Karamazov                             Fyodor Dostoyevsky
  37087  2122197       5  0316015849        41865  Twilight (Twilight, #1)                            Stephenie Meyer
 950201  2107691       3  044619817X      5931169  Santa Olivia (Santa Olivia, #1)                    Jacqueline Carey
 114404  2144218       3  0441015891      2233407  From Dead to Worse (Sookie Stackhouse, #8)         Charlaine Harris
1053208  2143953       4  0451463099      6582703  Unknown (Outcast Season, #2)                       Rachel Caine
  18290  2148946       5  0060525592       275000  Fire and Ice (Warriors, #2)                        Erin Hunter
1585865  2140143       3  1594742812      4133366  The Curious Case of Benjamin Button: A Graphic...  Nunzio DeFilippis
1115470  2125069       0  0758234937     11796251  Cinnamon Roll Murder (Hannah Swensen, #15)         Joanne Fluke
 484235  2108848       5  0553816713        15931  The Notebook (The Notebook, #1)                    Nicholas Sparks
I want to use the column ls['UserID'] to select all of those users from the second dataset ('data').
I've tried:
data.loc[ls['UserID'] == data['UserID']]
...which gives me
ValueError: Can only compare identically-labeled Series objects
I've tried both sort_index() and reset_index().
Help would be appreciated. Thanks!
Upvotes: 0
Views: 49
Reputation: 104
Try this:
df_merge = pd.merge(ls, data, on='UserID')
You can also use the left_on and right_on parameters of pd.merge if the key column has a different name in each DataFrame.
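Here is a minimal sketch of what the merge returns, using tiny made-up frames in place of the real ls and data (the values and the renamed column `user` are assumptions for illustration only). It also shows `Series.isin`, a different technique from the merge, which may be closer to a pure filter because it leaves the columns and index of data unchanged:

```python
import pandas as pd

# Toy stand-ins for the question's frames; all values are invented.
ls = pd.DataFrame({"UserID": [1, 2, 3], "Rating": [5, 5, 4]})
data = pd.DataFrame(
    {"UserID": [3, 4, 1, 5], "Rating": [2, 3, 5, 1]},
    index=[10, 11, 12, 13],
)

# Inner merge (the default): keeps only UserIDs present in both frames.
# Non-key columns shared by both frames get suffixes (Rating_x, Rating_y).
df_merge = pd.merge(ls, data, on="UserID")

# If the key had different names, left_on/right_on would name each side.
ls_renamed = ls.rename(columns={"UserID": "user"})  # hypothetical column name
df_merge_lr = pd.merge(ls_renamed, data, left_on="user", right_on="UserID")

# Alternative (not the merge above): a boolean mask built with isin.
# It compares values rather than index labels, so the two Series need not
# share an index, and the rows of data come back untouched.
filtered = data[data["UserID"].isin(ls["UserID"])]

print(df_merge)
print(filtered)
```

Note that the merge returns columns from both frames, and a UserID that appears several times in ls will repeat the matching rows of data, so the isin mask is worth considering if you only want rows of data.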
Upvotes: 2