Reputation: 537
I am trying to figure out how to iterate through each value in this pandas DataFrame to see if its absolute value is higher than some defined threshold, e.g. abs(value) > 0.01.
   APP ENABLED DEVICES  APPLE/MACINTOSH  CN APPLECARE  BARGAIN BOOKS
0             0.017685         0.000123      0.009362       0.039916
1             0.014884         0.009920      0.004747      -0.000653
2            -0.044820        -0.054319      0.001925      -0.179533
3            -0.014449         0.193068     -0.006028      -0.026057
4             0.047403        -0.046199     -0.047391       0.060473
I need the final output to look something like this
[{APP ENABLED DEVICES:0.017685,BARGAIN BOOKS:0.039916},
{APP ENABLED DEVICES:0.014884},
...]
So it would be a list of dictionaries, with one dictionary per row. Only key-value pairs whose absolute value is higher than the defined threshold should be included. Is this possible? If so, could somebody please walk me through how to do it? Sorry, Python is still relatively new to me... Thank you
P.S. This is just a small part of the dataset. The full DataFrame has many more columns, so explicitly naming individual columns is unworkable.
Upvotes: 0
Views: 280
Reputation: 1718
The following example will give you the desired output and show some of the things you can do to manipulate the dataframe as well, without looping over each item.
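import pandas as pd

# Small example frame standing in for the real data
pdf = pd.DataFrame.from_dict({'APP': [0.013, 0.42, -0.23, 0.06],
                              'BOOKS': [-1.3, 0.04, 0.54, 0.01],
                              'CN': [0.012, -0.03, 0.003, 0.5]})

abs_pdf = pdf.abs()  # absolute value of every cell
value = 0.01         # threshold

# Keep cells above the threshold; everything else becomes NaN
sel = abs_pdf[abs_pdf > value]

results = list()
by_col = sel.T  # transpose so each original row becomes a column
for col in by_col:
    # Drop the NaN cells and turn what is left into a dict
    results.append(by_col[col].dropna().to_dict())

print(results)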
This will yield:
[{'APP': 0.012999999999999999, 'BOOKS': 1.3, 'CN': 0.012},
{'APP': 0.41999999999999998,
'BOOKS': 0.040000000000000001,
'CN': 0.029999999999999999},
{'APP': 0.23000000000000001, 'BOOKS': 0.54000000000000004},
{'APP': 0.059999999999999998, 'CN': 0.5}]
You should be able to change the code if you want different outputs.
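For instance, the output above holds absolute values because the selection is taken from abs_pdf. If the original signs should be kept, one possible variation is to mask the original frame instead. This is only a sketch on the same example data; the names threshold, masked, and signed_results are made up for illustration.

```python
import pandas as pd

# Same example data as above
pdf = pd.DataFrame({'APP': [0.013, 0.42, -0.23, 0.06],
                    'BOOKS': [-1.3, 0.04, 0.54, 0.01],
                    'CN': [0.012, -0.03, 0.003, 0.5]})
threshold = 0.01  # illustrative name for the cutoff

# Keep the original (signed) value wherever its absolute value
# exceeds the threshold; every other cell becomes NaN.
masked = pdf.where(pdf.abs() > threshold)

# Build one dict per row, dropping the NaN cells.
signed_results = [row.dropna().to_dict() for _, row in masked.iterrows()]

print(signed_results)
```

The comparison is still done on absolute values, so the same cells are selected as before; only the reported values differ (for example, BOOKS in the first row comes back as -1.3 rather than 1.3).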
Upvotes: 1
Reputation: 657
You probably don't really want to iterate, as pandas gives you much more efficient ways of doing most things. The first step will be to grab all the rows above the threshold. You could do that like so:
df = df[df["APPLE/MACINTOSH"].abs() > 0.01]
Then you need to convert that to a list of dictionaries using to_dict, for example df.to_dict(orient="records").
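As a rough sketch of those two steps on the sample data from the question (the frame is rebuilt by hand here, assuming the four columns implied by the four values per row), the row filter keeps every column of the rows it selects. The second part is one possible way to get the per-cell filtering the question asks for without naming any column; the names threshold, records, and per_cell are illustrative only.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand
df = pd.DataFrame({
    "APP ENABLED DEVICES": [0.017685, 0.014884, -0.044820, -0.014449, 0.047403],
    "APPLE/MACINTOSH": [0.000123, 0.009920, -0.054319, 0.193068, -0.046199],
    "CN APPLECARE": [0.009362, 0.004747, 0.001925, -0.006028, -0.047391],
    "BARGAIN BOOKS": [0.039916, -0.000653, -0.179533, -0.026057, 0.060473],
})
threshold = 0.01  # illustrative name for the cutoff

# Step 1: keep only the rows where one named column passes the threshold.
rows = df[df["APPLE/MACINTOSH"].abs() > threshold]

# Step 2: one dict per remaining row; all columns are still present.
records = rows.to_dict(orient="records")

# Per-cell alternative: convert every row to a dict first, then keep
# only the entries whose absolute value passes the threshold.
per_cell = [
    {col: val for col, val in rec.items() if abs(val) > threshold}
    for rec in df.to_dict(orient="records")
]

print(records)
print(per_cell)
```

The per-cell version loops in plain Python after the conversion, which is simple to read but may be slower than a masked approach on a very wide frame.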
Upvotes: 2