Selecting values with a threshold pandas dataframe

Question

I would like to make a list of the columns which for each of those index = [0,43,46] the values are higher than a certain value, for example, 1000.

My data frame looks like this but with more columns.

index           asset.assetStateId.10  asset.assetStateId.100  asset.assetStateId.101  \
0.0                  1057.0                     0.0                     0.0   
43.0                  380.0                  1441.0                     0.0   
46.0                    0.0                     0.0                  1441.0

So, in the end, I would like to have three lists which the name of the column which the values are higher than 1000 for each row.

Thanks in advance,

I have tried so far this:

lista_colunas = list(result_1.columns.values)
lista_assets = []
for asset in lista_colunas:
    if result_1.loc[result_1[asset]>1000]:
        lista_assets += [asset]

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

jezrael · Accepted Answer

I think you can create dictionary of lists:

First select rows by loc if necessary, then reshape by stack and reset_index:

idx  = [0,43,46] 
df = df.loc[idx].stack().reset_index(name='val')
print (df)    
   Index                 level_1     val
0      0   asset.assetStateId.10  1057.0
1      0  asset.assetStateId.100     0.0
2      0  asset.assetStateId.101     0.0
3     43   asset.assetStateId.10   380.0
4     43  asset.assetStateId.100  1441.0
5     43  asset.assetStateId.101     0.0
6     46   asset.assetStateId.10     0.0
7     46  asset.assetStateId.100     0.0
8     46  asset.assetStateId.101  1441.0

Then filter by boolean indexing, groupby and create lists, last convert to dict:

a = df[df['val'] > 1000].groupby('index')['level_1'].apply(list).to_dict()

Another solution is loop by each row and by condition create lists:

a = df.loc[idx].apply(lambda x: x.index[x > 1000].tolist(), axis=1).to_dict()

print (a)
{0.0: ['asset.assetStateId.10'], 
43.0: ['asset.assetStateId.100'], 
46.0: ['asset.assetStateId.101']}

print (a[0])
['asset.assetStateId.10']

print (a[43])
['asset.assetStateId.100']

Selecting values with a threshold pandas dataframe

Answers (2)

Related Questions