Reputation: 643
I have read a bunch of data from a CSV file into my dataframe (df). One of the fields is a GeoLocation, stored in a single column as (longitude, latitude), and I want to slice out the rows where the longitude is between 37 and 40. I am having trouble using the 'df.where()' function:
geo = df.where(df['GeoLocation'][0] < 40 & df['GeoLocation'][0] > 37)
This keeps throwing an error saying:
TypeError: 'str' object cannot be interpreted as an integer
What am I doing wrong when I try to slice the column?
Here is the code I used to pull in the data:
df = pd.concat([x for x in pd.read_csv('U.S._Chronic_Disease_Indicators__CDI_.csv', chunksize=1000)], ignore_index=True)
Upvotes: 1
Views: 668
Reputation: 11347
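df['GeoLocation'][0] selects the first row's value (a string), not the first coordinate of every row. You want to split the series into numeric columns first, then filter: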
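# The column holds "(longitude, latitude)" strings: strip the parentheses, split on the comma, convert to float.
df[['long', 'lat']] = df['GeoLocation'].str.strip('()').str.split(',', expand=True).astype(float)
# Keep rows whose longitude lies strictly between 37 and 40.
geo = df[(df['long'] > 37) & (df['long'] < 40)]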
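Note that the [(x) & (y)] form needs explicit parentheses around each comparison: & binds more tightly than < and >, so without them Python tries to evaluate something like 40 & df[...] first.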
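As a rough, self-contained sketch of the split-then-filter approach: the sample rows below are made up and only stand in for the CSV, and the "(longitude, latitude)" string format is assumed from the question. The last line uses Series.between as an alternative range test; its inclusive="neither" option assumes pandas 1.3 or newer.

```python
import pandas as pd

# Hypothetical sample rows standing in for the CSV; the values are made up.
df = pd.DataFrame({
    "GeoLocation": ["(38.5, -77.0)", "(41.2, -73.9)", "(37.0, -80.1)"],
})

# Strip the parentheses, split on the comma, and convert both parts to float.
coords = (
    df["GeoLocation"]
    .str.strip("()")
    .str.split(",", expand=True)
    .astype(float)
)
df["long"] = coords[0]
df["lat"] = coords[1]

# Each comparison is wrapped in parentheses because & binds tighter than < and >.
geo = df[(df["long"] > 37) & (df["long"] < 40)]

# Series.between expresses the same range test; "neither" excludes both endpoints.
geo_between = df[df["long"].between(37, 40, inclusive="neither")]
```

With this sample, only the first row has a longitude strictly between 37 and 40, so both geo and geo_between should contain just that row.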
Upvotes: 2