Reputation: 20119

Return name of all columns matching a condition in pandas

I have the following dataframe:

           11NLI        11SEP       14COS       15COS      15NTF
11NLI   0.000000    98.479289   99.649427   87.732653   0.360731
11SEP   98.479289   0.000000    15.129470   13.901060   98.469187
14COS   99.649427   15.129470   0.000000    13.073843   99.694179
15COS   87.732653   13.901060   13.073843   0.000000    87.756845
15NTF   0.360731    98.469187   99.694179   87.756845   0.000000

These are distances. Now I need to have, for each row, a dictionary storing the column names that match the condition that the distance (the cell value) is <= 1. For example (and I can miss some, but), I expect the following:

{"11NLI":["11NLI","15NTF"], 
 "11SEP":["11SEP"],
 "14COS":["14COS"],
 "15NTF":["11NLI","15NTF"]}

So, assuming I have to iterate over rows, how do I return the name of the columns for which the condition that the value is <= 1? Maybe it's better to find the rows?

Upvotes: 3

Answers (3)

ansev

Reputation: 30920

Use:

(df.where(df.le(1))
   .reset_index()
   .melt('index')
   .dropna()
   .groupby('index')['variable']
   .agg(list).to_dict())

Output

{'11NLI': ['11NLI', '15NTF'],
 '11SEP': ['11SEP'],
 '14COS': ['14COS'],
 '15COS': ['15COS'],
 '15NTF': ['11NLI', '15NTF']}

Upvotes: 4

Umar.H

Reputation: 23099

We can use melt and groupby with query to match your condition:

df_dict = df.reset_index().melt(id_vars="index").query("value < 1").groupby("variable")[
    "index"
].agg(list).to_dict()


print(df_dict)

{'11NLI': ['11NLI', '15NTF'],
 '11SEP': ['11SEP'],
 '14COS': ['14COS'],
 '15COS': ['15COS'],
 '15NTF': ['11NLI', '15NTF']}

Upvotes: 2

BENY

Reputation: 323226

First create your condition , then we do stack with groupby

s=df.where(df.gt(0)).stack()
d=s.reset_index().groupby('level_0')['level_1'].agg(list).to_dict()
d
Out[151]: 
{'11NLI': ['11SEP', '14COS', '15COS', '15NTF'],
 '11SEP': ['11NLI', '14COS', '15COS', '15NTF'],
 '14COS': ['11NLI', '11SEP', '15COS', '15NTF'],
 '15COS': ['11NLI', '11SEP', '14COS', '15NTF'],
 '15NTF': ['11NLI', '11SEP', '14COS', '15COS']}

Upvotes: 1

Return name of all columns matching a condition in pandas

Answers (3)

Related Questions