Reputation: 1060
I am trying to get a value situated on the third column from a pandas dataframe by knowing the values of interest on the first two columns, which point me to the right value to fish out. I do not know the row index, just the values I need to look for on the first two columns. The combination of values from the first two columns is unique, so I do not expect to get a subset of the dataframe, but only a row. I do not have column names and I would like to avoid using them.
Consider the dataframe df
:
a 1 bla
b 2 tra
b 3 foo
b 1 bar
c 3 cra
I would like to get tra
from the second row, based on the b
and 2
combination that I know beforehand. I've tried subsetting with
df = df.loc['b', :]
which returns all the rows with b
on the same column (provided I've read the data with index_col = 0
) but I am not able to pass multiple conditions on it without crashing or knowing the index of the row of interest. I tried both df.loc
and df.iloc
.
In other words, ideally I would like to get tra
without even using row indexes, by doing something like:
df[(df[,0] == 'b' & df[,1] == `2`)][2]
Any suggestions? Probably it is something simple enough, but I have the tendency to use the same syntax as in R, which apparently is not compatible.
Thank you in advance
Upvotes: 2
Views: 1649
Reputation: 1060
As @anky has suggested, a way to do this without knowing the column names nor the row index where your value of interest is, would be to read the file in a pandas dataframe using multiple column indexing.
For the provided example, knowing the column indexes at least, that would be:
df = pd.read_csv(path, sep='\t', index_col=[0, 1])
then, you can use:
df = df.iloc[df.index.get_loc(("b", 2)):]
df.iloc[0]
to get the value of interest.
Thanks again @anky for your help. If you found this question useful, please upvote @anky 's comment in the posted question.
Upvotes: 2
Reputation: 3857
I'd probably use pd.query
for that:
import pandas as pd
df = pd.DataFrame(index=['a', 'b', 'b', 'b', 'c'], data={"col1": [1, 2, 3, 1, 3], "col2": ['bla', 'tra', 'foo', 'bar', 'cra']})
df
col1 col2
a 1 bla
b 2 tra
b 3 foo
b 1 bar
c 3 cra
df.query('col1 == 2 and col2 == "tra"')
col1 col2
b 2 tra
Upvotes: 1