Reputation: 11
I have the following dataset for Manhattan neighborhoods with the most common venues in each neighborhood:
I made a list of venues:
fit_venues = ['Coffee Shop', 'Café', 'Park', 'Hotel', 'Sandwich Place', 'Pizza Place', 'Gym / Fitness Center', 'Exhibit', 'Gym', 'Supermarket', 'Nightclub', 'Concert Hall', 'Jazz Club']
I want to add a column to the dataframe (call it "Fit Neighborhood", for example) that compares the most common venues of each neighborhood (5 columns) against the list "fit_venues" and stores the result as Yes/No or True/False. For example, the first two rows should return Yes/True and the third row should return No/False.
Any help?
Upvotes: 1
Views: 6325
Reputation: 1474
Have you tried using DataFrame.isin()?
You didn't give the names of your most common venue columns, so I'll assume they are the only columns in the DataFrame (df):
fit_venues = ['Coffee Shop', 'Café', 'Park', 'Hotel', 'Sandwich Place', 'Pizza Place', 'Gym / Fitness Center', 'Exhibit', 'Gym', 'Supermarket', 'Nightclub', 'Concert Hall', 'Jazz Club']
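# axis=1 reduces across the columns of each row; the default (axis=0) would return one value per column instead
df['Fit Neighborhood'] = df.isin(fit_venues).any(axis=1)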
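If the DataFrame also holds non-venue columns (a neighborhood name, for instance), the check can be limited to the venue columns. Below is a minimal sketch on made-up data; the column names ("Neighborhood", "1st Most Common Venue", and so on) and the row values are assumptions, since the question does not show them. It uses any(axis=1), so a row counts as fit if at least one of its venues is in the list; swap in all(axis=1) if every venue has to match.

```python
import pandas as pd

fit_venues = ['Coffee Shop', 'Café', 'Park', 'Hotel', 'Sandwich Place',
              'Pizza Place', 'Gym / Fitness Center', 'Exhibit', 'Gym',
              'Supermarket', 'Nightclub', 'Concert Hall', 'Jazz Club']

# Hypothetical sample data: column names and values are assumed for illustration.
df = pd.DataFrame({
    "Neighborhood": ["A", "B", "C"],
    "1st Most Common Venue": ["Coffee Shop", "Bakery", "Bakery"],
    "2nd Most Common Venue": ["Bar", "Park", "Bank"],
    "3rd Most Common Venue": ["Bakery", "Bar", "Pharmacy"],
    "4th Most Common Venue": ["Bank", "Pharmacy", "Bar"],
    "5th Most Common Venue": ["Pharmacy", "Bank", "Deli / Bodega"],
})

# Select only the venue columns so "Neighborhood" is not compared against the list.
venue_cols = [c for c in df.columns if c.endswith("Most Common Venue")]

# isin() gives a boolean DataFrame; any(axis=1) collapses it to one boolean per row.
df["Fit Neighborhood"] = df[venue_cols].isin(fit_venues).any(axis=1)
```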
Upvotes: 1
Reputation: 523
See if this works:
fit_venues = ['Coffee Shop', 'Café', 'Park', 'Hotel', 'Sandwich Place', 'Pizza Place', 'Gym / Fitness Center', 'Exhibit', 'Gym', 'Supermarket', 'Nightclub', 'Concert Hall', 'Jazz Club']
# isin() already returns a boolean Series, so assign it directly; note this only checks the 5th column
df["binary_check"] = df["5th Most Common Venue"].isin(fit_venues)
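Since the question allows Yes/No as well as True/False, the boolean result of a single-column check can be mapped to labels afterwards. A small sketch, assuming a column named "5th Most Common Venue" and invented sample values:

```python
import pandas as pd

fit_venues = ['Coffee Shop', 'Café', 'Park', 'Hotel', 'Sandwich Place',
              'Pizza Place', 'Gym / Fitness Center', 'Exhibit', 'Gym',
              'Supermarket', 'Nightclub', 'Concert Hall', 'Jazz Club']

# Hypothetical single-column data for illustration only.
df = pd.DataFrame({"5th Most Common Venue": ["Park", "Bakery", "Gym"]})

# Boolean Series: True where the venue appears in fit_venues.
is_fit = df["5th Most Common Venue"].isin(fit_venues)

# Map the booleans to Yes/No labels.
df["binary_check"] = is_fit.map({True: "Yes", False: "No"})
```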
Upvotes: 1