Pandas: How to get all values for a column, where another column's value is a specific value

Question

I have a dataframe with two columns, sample_id and mutation. Each sample contains several mutations:

sample_id    mutation
sample1      mutation_A
sample1      mutation_B
sample1      mutation_D

sample2      mutation_C
sample2      mutation_D

sample3      mutation_A
sample3      mutation_B
sample3      mutation_C

I want to find the samples where, say, mutation_C exists, but I want all of the rows for each matching sample, not only the matching row. This:

df.loc[df['mutation'] == 'mutation_C']

returns:

sample_id    mutation
sample2      mutation_C

How do I get the rest of the mutation data for sample2, like this:

sample_id    mutation
sample2      mutation_C
sample2      mutation_D

I have been trying to use groupby but can't figure out how to obtain all the results.

jezrael · Accepted Answer

First select the sample_id values of the rows that contain the mutation, then filter the original DataFrame again with isin:

a = df.loc[df['mutation'] == 'mutation_C', 'sample_id']
print(a)

3    sample2
7    sample3
Name: sample_id, dtype: object

df = df[df['sample_id'].isin(a)]
print(df)

  sample_id    mutation
3   sample2  mutation_C
4   sample2  mutation_D
5   sample3  mutation_A
6   sample3  mutation_B
7   sample3  mutation_C
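
Since the question mentions groupby, here is a minimal self-contained sketch that rebuilds the example data and compares the isin approach above with a groupby-based one. The helper names rows_for_samples_with and rows_for_samples_with_groupby are made up for illustration, and the DataFrame is reconstructed from the table in the question, so treat it as an assumption about the real data rather than the asker's actual file.

```python
import pandas as pd

# Example data reconstructed from the question (assumed layout, default 0..7 index).
df = pd.DataFrame({
    "sample_id": ["sample1"] * 3 + ["sample2"] * 2 + ["sample3"] * 3,
    "mutation": [
        "mutation_A", "mutation_B", "mutation_D",
        "mutation_C", "mutation_D",
        "mutation_A", "mutation_B", "mutation_C",
    ],
})


def rows_for_samples_with(frame, mutation):
    """Hypothetical helper: the two-step isin filter from the answer above."""
    # Step 1: sample_id values of the rows that carry the mutation.
    ids = frame.loc[frame["mutation"] == mutation, "sample_id"]
    # Step 2: keep every row whose sample_id is among those values.
    return frame[frame["sample_id"].isin(ids)]


def rows_for_samples_with_groupby(frame, mutation):
    """Hypothetical helper: the same result via groupby + transform."""
    # Per-row flag: True if any row in the same sample_id group has the mutation.
    has_mutation = (
        frame["mutation"].eq(mutation)
        .groupby(frame["sample_id"])
        .transform("any")
    )
    return frame[has_mutation]


result = rows_for_samples_with(df, "mutation_C")
result_groupby = rows_for_samples_with_groupby(df, "mutation_C")
print(result)
```

Both helpers return new DataFrames and leave df untouched, so the original data stays available for further queries. For a single lookup like this the isin version is probably the simpler choice; the groupby version is mostly useful when the per-group condition is more involved than a membership test.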
