Harry Maguire

Reputation: 153

Pandas - How do I look for a set of values in one column and, where present, return the values from other columns?

I am new to pandas. I have a CSV file with latitude and longitude columns and a tile ID column; the file has around 1 million rows. I have a list of around a hundred tile IDs and want to get the latitude and longitude coordinates for those tile IDs. Currently I have:

good_tiles_str = [str(q) for q in good_tiles]  # convert the list elements to strings
file['tile'] = file.tile.astype(str)  # convert the tile column to strings

for i in range (len(good_tiles_str)):
     x = good_tiles_str[i]
     lat = file.loc[file['tile'].str.contains(x), 'BL_Latitude'] #finding lat coordinates
     long = file.loc[file['tile'].str.contains(x), 'BL_Longitude'] #finding long coordinates
print(lat)
print(long)

This method is very slow, and I know it is not the correct approach, as I have heard you should not use for loops like this with pandas. It also does not work: it doesn't find all the latitude and longitude points for the tile IDs.

Any help would be greatly appreciated.

Upvotes: 0

Views: 136

Answers (2)

Ashish Acharya

Reputation: 3399

Try this:

search_for = '|'.join(good_tiles_str)
good = file[file.tile.str.contains(search_for)]
good = good[['BL_Latitude', 'BL_Longitude']].drop_duplicates()
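
One caveat: str.contains does substring (regex) matching, so a tile ID such as 12 would also match 123 or 312. If the IDs are meant to match exactly, a minimal sketch using isin could look like the following. The sample data here is made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV from the question.
file = pd.DataFrame({
    "tile": [12, 123, 45, 12, 7],
    "BL_Latitude": [51.0, 52.0, 53.0, 51.0, 54.0],
    "BL_Longitude": [-1.0, -2.0, -3.0, -1.0, -4.0],
})
good_tiles = [12, 7]

# Compare strings with strings so a dtype mismatch cannot hide matches.
good_tiles_str = [str(q) for q in good_tiles]
mask = file["tile"].astype(str).isin(good_tiles_str)

# Keep the tile ID alongside the coordinates so each point can be traced back.
good = file.loc[mask, ["tile", "BL_Latitude", "BL_Longitude"]].drop_duplicates()
print(good)
```

Because isin tests whole values rather than substrings, tile 123 is not picked up when only 12 is requested, and the lookup is done in one vectorised pass instead of one pass per tile ID.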

Upvotes: 0

Abhishek Singh

Reputation: 113

As far as I understand your question, there is no need to iterate over the rows explicitly.

If you want a particular assignment given a condition, you can do so explicitly. Here's one way using numpy.where; we use ~ to negate a condition.

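import numpy as np

# x is a single tile ID string, as in the question's loop
rule1 = file['tile'].str.contains(x)
rule2 = file['tile'].str.contains(x)  # identical to rule1 as written; substitute your second condition

file['flag'] = np.where(rule1, 'BL_Latitude', ' ')
file['flag'] = np.where(rule2 & ~rule1, 'BL_Longitude', file['flag'])  # has no effect while rule2 equals rule1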
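
As a rough sketch of the same np.where idea applied to the whole list of tile IDs at once, one option is to build a single boolean mask and use ~ for its complement. The sample data and the 'good' / 'other' labels below are assumptions for illustration, not part of the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
file = pd.DataFrame({
    "tile": ["12", "123", "45", "12", "7"],
    "BL_Latitude": [51.0, 52.0, 53.0, 51.0, 54.0],
    "BL_Longitude": [-1.0, -2.0, -3.0, -1.0, -4.0],
})
good_tiles_str = ["12", "7"]

# One boolean mask covering every wanted tile ID (exact matches only).
is_good = file["tile"].isin(good_tiles_str)

# np.where picks the first label where the mask is True, the second elsewhere.
file["flag"] = np.where(is_good, "good", "other")

# ~ inverts the mask: these are the rows whose tile ID is NOT in the list.
not_good = file.loc[~is_good, ["tile", "BL_Latitude", "BL_Longitude"]]
print(file)
print(not_good)
```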

Upvotes: 1
