Jael
Jael

Reputation: 369

Get values of two different columns based on a condition in a third column

I have a certain condition (Incident = yes) and I want to know the values in two columns fulfilling this condition. I have a very big data frame (many rows and many columns) and I am looking for a "screening" function.

To illustrate the following example with the df (which has many more columns than shown):

Repetition  Step    Incident   Test1   Test2
1            1         no         10     20
1            1         no         9      11
1            2         yes        9      19
1            2         yes        11     20
1            2         yes        12     22
1            3         yes        9      18
1            3         yes        8      18

What I would like to get as an answer is

Repetition   Step
1              2
1              3

If I only wanted to know the Step, I would use the following command:

df[df.Incident == 'yes'].Step.unique()

Is there a similar command to get the values of two columns for a specific condition?

Thanks for the help! :-)

Upvotes: 0

Views: 431

Answers (1)

sammywemmy
sammywemmy

Reputation: 28659

You could use the query option for the condition, select the interested columns, and finally remove duplicate values

 df.query('Incident=="yes"').filter(['Repetition','Step']).drop_duplicates()

OR

you could use the Pandas' loc method, set the rows as the condition, set the columns part with the columns you are interested in, then drop the duplicates.

df.loc[df.Incident=="yes",['Repetition','Step']].drop_duplicates()

Upvotes: 1

Related Questions