Reputation: 1100
Is there a way to get '&' or 'or' results for sorting multiple columns with string contains?
For example below if the genre column contains rock and demographics contain 30-50 and city contains Gainesville, I get the results that match those strings in the data frame.
Which I understand is the correct results genre column contains rock and or demographics contain 30-50 and/or city contains Gainesville I get the results that match those strings in the data frame?
This way to also add to the sparse results for example will populate 5 results in the .head(5) with the full matches plus the partial matches for the secondary requirements.
The genre has to contain the str rock and the demo may contain the str 35-50 and or the city contains the str Gainesville or not.
df[(df.genre.str.contains('rock')) & (df.demo.str.contains('35-50')) & (df.city.str.contains('Gainesville'))].head(4)
name genre demo price city
0 Alex Smith rock 18-25,25-35,35-50 100-500 Gainesville
4 Bob West rock 18-25,25-35,35-50 100-500 Gainesville
The desired result would be something like this
name genre demo price city
0 Alex Smith rock 18-25,25-35,35-50 100-500 Gainesville
4 Bob West rock 18-25,25-35,35-50 100-500 Gainesville
1 Mike Stevens pop, rock 18-25,25-35,35-50, 50+ 100-500 Somerville
6 Mary Porter jazz, rock 35-50 100-500, 500-100 Hendersonville
This is what I was able to
import pandas as pd
df = pd.DataFrame({'name':['Alex Smith','Mike Stevens','Brenda West','Holy Kent','Bob West','Sally May','Mary Porter','John Keys'],
'genre': ['rock','pop, rock','jazz',"dj",'rock','pop','jazz, rock',"dj"],'demo':['18-25,25-35,35-50','18-25,25-35,35-50, 50+','35-50','18-25','18-25,25-35,35-50','18-25,35-50, 50+','35-50','18-25'],
'price':['100-500','100-500','100-500, 500-100','1000+','100-500','100-500', '100-500, 500-100','1000+'],
'city':['Gainesville','Somerville','Hendersonville','Pluto','Gainesville','Somerville','Hendersonville','Pluto']})
df.head(10)
name genre demo price city
0 Alex Smith rock 18-25,25-35,35-50 100-500 Gainesville
1 Mike Stevens pop, rock 18-25,25-35,35-50, 50+ 100-500 Somerville
2 Brenda West jazz 35-50 100-500, 500-100 Hendersonville
3 Holy Kent dj 18-25 1000+ Pluto
4 Bob West rock 18-25,25-35,35-50 100-500 Gainesville
5 Sally May pop 18-25,35-50, 50+ 100-500 Somerville
6 Mary Porter jazz, rock 35-50 100-500, 500-100 Hendersonville
7 John Keys dj 18-25 1000+ Pluto
This gives me the result of the keyword rock in genre and demo of 30-50
# This gives me the result of the keyword rock in genre and demo of 30-50
df[(df.genre.str.contains('rock')) & (df.demo.str.contains('35-50'))]
name genre demo price city
0 Alex Smith rock 18-25,25-35,35-50 100-500 Gainesville
1 Mike Stevens pop, rock 18-25,25-35,35-50, 50+ 100-500 Somerville
4 Bob West rock 18-25,25-35,35-50 100-500 Gainesville
6 Mary Porter jazz, rock 35-50 100-500, 500-100 Hendersonville
**This gives me the result of the keyword rock in genre and demo of 30-50**
# This limits the results to 3 of the keyword in genre and demo of 30-50
df[(df.genre.str.contains('rock')) & (df.demo.str.contains('35-50'))].head(3)
name genre demo price city
0 Alex Smith rock 18-25,25-35,35-50 100-500 Gainesville
1 Mike Stevens pop, rock 18-25,25-35,35-50, 50+ 100-500 Somerville
4 Bob West rock 18-25,25-35,35-50 100-500 Gainesville
This gives me results for rock with demo 30-50 sorted by city
# best results for rock with demo 30-50 sorted by city by 3
df[(df.genre.str.contains('rock')) & (df.demo.str.contains('35-50')) & (df.city.str.contains('Gainesville'))].head(3)
name genre demo price city
0 Alex Smith rock 18-25,25-35,35-50 100-500 Gainesville
4 Bob West rock 18-25,25-35,35-50 100-500 Gainesville
Upvotes: 0
Views: 44
Reputation: 323226
Check with or not and
df[(df.genre.str.contains('rock')) | (df.demo.str.contains('35-50')) | (df.city.str.contains('Gainesville'))].head(4)
Upvotes: 2