Reputation: 245
In Python's pandas, suppose there is a DataFrame in which the values in one of the columns are strings, each holding a comma-separated list.
import pandas as pd
df = pd.DataFrame({'State': ['California', 'Oregon', 'Washington'],
                   'Cities': ['Los Angeles, Oakland, San Diego', 'Portland, Eugene', 'Seattle, Spokane']})
How would you select the rows where any one of the values inside a column's string matches a given value? For example, how would I return just the rows that have 'Los Angeles' as one of the cities?
My first thought is to loop through each row in the DataFrame and use string manipulation ( .split(',') ) to break up each string, although this does not seem efficient for very large datasets. However, I'm not sure where to go from there to actually select the matching rows.
Upvotes: 3
Views: 5623
Reputation: 7179
Following from Woody Pride's comment:
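To match rows whose Cities value is exactly one city (this comes back empty here, because every cell lists several cities):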
df[df.Cities == 'Los Angeles']
>>>
Empty DataFrame
Columns: [Cities, State]
Index: []
For strings that may contain multiple cities, use a substring match instead:
df[df.Cities.str.contains('Los Angeles')]
>>>
                            Cities       State
0  Los Angeles, Oakland, San Diego  California
See the pandas docs for Series.str.contains.
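One caveat with str.contains is that it matches substrings, so a search for 'Port' would also hit 'Portland'. If whole city names matter, a rough sketch along the lines of the split idea from the question could look like the following. The helper name has_city is made up for illustration, and the sketch assumes cities are separated by commas with optional surrounding spaces.

```python
import pandas as pd

df = pd.DataFrame({
    'State': ['California', 'Oregon', 'Washington'],
    'Cities': ['Los Angeles, Oakland, San Diego',
               'Portland, Eugene',
               'Seattle, Spokane'],
})

# Substring match; regex=False treats the pattern as a literal string.
substring_hits = df[df['Cities'].str.contains('Los Angeles', regex=False)]

# Hypothetical helper: split a cell on commas, strip whitespace,
# and check whether the city appears as a whole entry.
def has_city(cell, city):
    return city in [name.strip() for name in cell.split(',')]

# Exact match on whole city names, applied cell by cell.
exact_hits = df[df['Cities'].apply(lambda cell: has_city(cell, 'Los Angeles'))]

# 'Port' is a substring of 'Portland' but not a whole city name.
partial_substring = df[df['Cities'].str.contains('Port', regex=False)]
partial_exact = df[df['Cities'].apply(lambda cell: has_city(cell, 'Port'))]
```

Here substring_hits and exact_hits both select the California row, while the 'Port' search selects Oregon under the substring match and nothing under the whole-name match. The apply version runs Python code per row, so it is likely slower than str.contains on large frames.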
Upvotes: 3