Reputation: 1005
I want to search a dataframe to find a value correspond to index row and column value. However, I am struggling because the column header values are the upper bound of a range where the previous column header value is the lower bound ( but the uppper bound of its other neightbor column value).
I cannot find a way to match my input with the column corresponding to the lower bound of the range.
with an example it is very easy to see:
data_frame_test = pd.DataFrame({'location' : [1, 2, 'S'],
'200' : [342, 690, 103],
'1000' : [322, 120, 193],
'2000' : [249, 990, 403]})
data_frame_test = data_frame_test.set_index('location')
and what I want to do is this
location = 1
weight = 500
output = data_frame_test.iloc[1][??] #must be equal to 342
see, the column where weight must look into is 200, because it is in the range between ]200;1000[. I don't know what else to try to translate that into python code. Any help would be greatly appreciated.
Upvotes: 1
Views: 414
Reputation: 863226
First convert columns to integers by rename
if necessary:
data_frame_test = data_frame_test.set_index('location').rename(columns=int)
print (data_frame_test)
200 1000 2000
location
1 342 322 249
2 690 120 990
S 103 193 403
weight = 500
location = 1
And then match values by positions with DataFrame.loc
with last position of True
value by compare by less values like weight:
#https://stackoverflow.com/a/8768734
b = (data_frame_test.columns[::-1] < weight)
pos = len(b) - np.argmax(b) - 1
print (pos)
0
output = data_frame_test.loc[location, data_frame_test.columns[pos]]
print (output)
342
Or you can use DataFrame.iloc
with position by Index.get_loc
:
output = data_frame_test.iloc[data_frame_test.index.get_loc(location), pos]
print (output)
342
Upvotes: 1
Reputation: 88276
I can think of cansting the columns to int
, and indexing the dataframe using Series.searchsorted
, and boolean indexing on the index:
location = 1
weight = 500
data_frame_test.iloc[data_frame_test.index==location,
data_frame_test.columns.astype('int').searchsorted(weight)-1]
location
1 342
Name: 200, dtype: int64
Upvotes: 1
Reputation: 590
You can make a custom function which will iterate over columns to check correct column and then return cell of that location:
def get_val(location, weight, df):
col = df.columns[0]
for c in df.columns:
if weight >= int(c):
col = c
else:
break
return df.loc[location, col]
get_val(1, 500, data_frame_test)
Upvotes: 1