Iterating through a Pandas Dataframe

Question

I am trying to figure out how to iterate through each value in this pandas DataFrame to see if its absolute value is higher than some defined threshold, e.g. abs(value) > 0.01.

  APP ENABLED DEVICES  APPLE/MACINTOSH      CN  APPLECARE  BARGAIN BOOKS
0             0.017685                 0.000123   0.009362       0.039916
1             0.014884                 0.009920   0.004747      -0.000653
2            -0.044820                -0.054319   0.001925      -0.179533
3            -0.014449                 0.193068  -0.006028      -0.026057
4             0.047403                -0.046199  -0.047391       0.060473

I need the final output to look something like this:

[{APP ENABLED DEVICES:0.017685,BARGAIN BOOKS:0.039916},
 {APP ENABLED DEVICES:0.014884},
 ...]

So it would be a list of dictionaries, with one dictionary per row. Only key-value pairs whose value is higher than the defined threshold should be included. Is this possible? If so, could somebody please walk me through how it could be done? Sorry, Python is still relatively new to me. Thank you.

P.S. This is just a small part of the dataset. The full DataFrame has many more columns, so explicitly naming individual columns is unworkable.

chris-sc · Accepted Answer

The following example gives you the desired output and shows some of the ways you can manipulate the DataFrame without looping over each individual item.

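import pandas as pd

# Small example frame standing in for the real data
pdf = pd.DataFrame.from_dict({'APP': [0.013, 0.42, -0.23, 0.06],
                              'BOOKS': [-1.3, 0.04, 0.54, 0.01],
                              'CN': [0.012, -0.03, 0.003, 0.5]})
abs_pdf = pdf.abs()                # element-wise absolute values
value = 0.01                       # threshold
sel = abs_pdf[abs_pdf > value]     # entries at or below the threshold become NaN

results = []
by_col = sel.T                     # transpose so each original row becomes a column
for col in by_col:
    # drop the NaN entries, then turn the remaining ones into a dict
    results.append(by_col[col].dropna().to_dict())
print(results)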

This will yield:

[{'APP': 0.012999999999999999, 'BOOKS': 1.3, 'CN': 0.012},
 {'APP': 0.41999999999999998,
  'BOOKS': 0.040000000000000001,
  'CN': 0.029999999999999999},
 {'APP': 0.23000000000000001, 'BOOKS': 0.54000000000000004},
 {'APP': 0.059999999999999998, 'CN': 0.5}]
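
To see why some keys are missing from the dictionaries above, it can help to look at the intermediate objects. The following is only a sketch using the same example data; the names mask and row2 are introduced here for illustration and are not part of the answer's code.

```python
import pandas as pd

# Same illustrative data as in the answer above.
pdf = pd.DataFrame({'APP': [0.013, 0.42, -0.23, 0.06],
                    'BOOKS': [-1.3, 0.04, 0.54, 0.01],
                    'CN': [0.012, -0.03, 0.003, 0.5]})
value = 0.01

abs_pdf = pdf.abs()
mask = abs_pdf > value    # boolean DataFrame with the same shape as pdf
sel = abs_pdf[mask]       # entries where mask is False are replaced by NaN

print(mask)
print(sel)

# Row 2: CN is 0.003, which is not above the threshold, so it is NaN in sel
# and dropna() removes it before the row is converted to a dict.
row2 = sel.loc[2].dropna().to_dict()
print(row2)
```

Because the comparison is strict (>), a value exactly equal to the threshold, such as BOOKS in row 3, is also excluded.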

Note that the values in the result are absolute values, because the selection is taken from abs_pdf. You should be able to adapt the code if you want different output.
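
As one possible variation (an assumption about what "different output" might mean, not something the question confirms), the original signed values can be kept by building the mask from the absolute values but selecting from the original frame. This sketch also replaces the transpose with a list comprehension over iterrows(); the name signed is introduced here for illustration.

```python
import pandas as pd

# Same illustrative data as in the answer above.
pdf = pd.DataFrame({'APP': [0.013, 0.42, -0.23, 0.06],
                    'BOOKS': [-1.3, 0.04, 0.54, 0.01],
                    'CN': [0.012, -0.03, 0.003, 0.5]})
value = 0.01

# Mask on the absolute value, but index the original frame so signs survive.
signed = pdf[pdf.abs() > value]

# One dict per row; dropna() removes the entries that failed the threshold.
results = [row.dropna().to_dict() for _, row in signed.iterrows()]
print(results)
```

No column names are spelled out anywhere, so the same code should work unchanged on a frame with many more columns.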
