user3590113

Reputation: 537

Iterating through a Pandas Dataframe

I am trying to figure out how to iterate through each value in this pandas DataFrame to see if its absolute value is higher than some defined threshold, e.g. abs(value) > 0.01.

  APP ENABLED DEVICES  APPLE/MACINTOSH      CN  APPLECARE  BARGAIN BOOKS
0             0.017685                 0.000123   0.009362       0.039916
1             0.014884                 0.009920   0.004747      -0.000653
2            -0.044820                -0.054319   0.001925      -0.179533
3            -0.014449                 0.193068  -0.006028      -0.026057
4             0.047403                -0.046199  -0.047391       0.060473

I need the final output to look something like this

[{APP ENABLED DEVICES:0.017685,BARGAIN BOOKS:0.039916},
 {APP ENABLED DEVICES:0.014884},
 ...]

So it would look like a list of dictionaries, with one dictionary per row. Only key-value pairs whose value is higher than the defined threshold should be included in the list. Is this possible? If so, can somebody please walk me through how this could be done? Sorry, Python is still relatively new for me... Thank you.

P.S. This is just a small part of the dataset. The full DataFrame has many more columns, so explicitly naming individual columns is unworkable.

Upvotes: 0

Views: 280

Answers (2)

chris-sc

Reputation: 1718

The following example will give you the desired output and show some of the things you can do to manipulate the dataframe as well, without looping over each item.

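import pandas as pd

# Small example frame standing in for the real data
pdf = pd.DataFrame.from_dict({'APP': [0.013, 0.42, -0.23, 0.06],
                              'BOOKS': [-1.3, 0.04, 0.54, 0.01],
                              'CN': [0.012, -0.03, 0.003, 0.5]})
abs_pdf = pdf.abs()              # element-wise absolute values
value = 0.01                     # threshold
sel = abs_pdf[abs_pdf > value]   # cells at or below the threshold become NaN

results = []
by_col = sel.T                   # transpose so each original row becomes a column
for col in by_col:
    # drop the NaN cells and turn what is left of the row into a dict
    results.append(by_col[col].dropna().to_dict())
print(results)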

This will yield the following (note that the values are absolute values, because the selection is made on abs_pdf):

[{'APP': 0.012999999999999999, 'BOOKS': 1.3, 'CN': 0.012},
 {'APP': 0.41999999999999998,
  'BOOKS': 0.040000000000000001,
  'CN': 0.029999999999999999},
 {'APP': 0.23000000000000001, 'BOOKS': 0.54000000000000004},
 {'APP': 0.059999999999999998, 'CN': 0.5}]

You should be able to adapt the code if you want different output, for example to keep the original signed values rather than their absolute values.
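As one possible adaptation, here is a minimal sketch that keeps the original signed values instead of their absolute values. It reuses the example frame from above; the names threshold, kept and results are only illustrative, and DataFrame.where is swapped in for the boolean indexing used in the answer.

```python
import pandas as pd

pdf = pd.DataFrame({'APP': [0.013, 0.42, -0.23, 0.06],
                    'BOOKS': [-1.3, 0.04, 0.54, 0.01],
                    'CN': [0.012, -0.03, 0.003, 0.5]})
threshold = 0.01  # illustrative name for the cut-off

# Keep the original (signed) value wherever |value| > threshold;
# every other cell becomes NaN.
kept = pdf.where(pdf.abs() > threshold)

# Build one dictionary per row, skipping the NaN cells.
results = [row.dropna().to_dict() for _, row in kept.iterrows()]
print(results)
```

The comparison is still made on the absolute values, so negative entries such as -1.3 are kept with their sign, while 0.01 itself is dropped because it is not strictly greater than the threshold.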

Upvotes: 1

Cody Braun

Reputation: 657

You probably don't really want to iterate, as pandas gives you much more efficient ways of doing most things. The first step is to select the rows that are above the threshold. For a single column, you could do that like so:

df = df[df["APPLE/MACINTOSH"] > 0.01]

Then you need to convert the result to a dictionary using the DataFrame's to_dict method.
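The line above filters on one column only. To apply the threshold to every column without naming them, one option is to call to_dict first and filter each row's dictionary afterwards. This is a rough sketch, not necessarily what the answer intended: the frame below is a three-row, three-column subset of the question's data, and threshold, records and filtered are illustrative names.

```python
import pandas as pd

# Subset of the question's data (first three rows, three of the columns)
df = pd.DataFrame({"APP ENABLED DEVICES": [0.017685, 0.014884, -0.044820],
                   "APPLE/MACINTOSH": [0.000123, 0.009920, -0.054319],
                   "BARGAIN BOOKS": [0.039916, -0.000653, -0.179533]})
threshold = 0.01

# orient="records" gives one {column: value} dictionary per row
records = df.to_dict(orient="records")

# Keep only the entries whose absolute value exceeds the threshold
filtered = [{col: val for col, val in rec.items() if abs(val) > threshold}
            for rec in records]
print(filtered)
```

Because the filtering happens on plain dictionaries, no column has to be named explicitly, which matters when the real frame has many columns.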

Upvotes: 2
