Reputation: 13
I have a DataFrame that looks like this:
ID | Clicks | Clicks_GA | Discrep_% | Discrep_Found |
---|---|---|---|---|
5939 | 18482 | 18480 | .01 | False |
#Calculates the discrepancy % (I also import numpy as np)
df['Discrep_%'] = np.absolute(df['Clicks'] - df['Clicks_GA']) / (df['Clicks_GA'] * 100)
#Returns True if the discrepancy is greater than 5%, otherwise False
df['Discrep_Found'] = (df['Discrep_%'] > .05)
The problem is that I have multiple dataframes, and I don't want to copy and paste the same line of code a bunch of times.
Is there a function I can use to make this process simpler?
Thanks!
Upvotes: 0
Views: 37
Reputation: 2804
Try this:
def count_some(df):
    val = np.absolute(df['Clicks'] - df['Clicks_GA']) / (df['Clicks_GA'] * 100)
    return val, val > .05
df[["Discrep_%", "Discrep_Found"]] = df.apply(count_some, axis=1, result_type='expand')
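If row-wise `apply` turns out to be slow, the same idea can probably be written as a function that works on whole columns at once. This is only a sketch: the name `add_discrepancy` and the `threshold` parameter are made up for illustration, and the second sample row is invented data. It also assumes the discrepancy is meant as a fraction of `Clicks_GA` (so 0.05 means 5%), which is why it divides by `Clicks_GA` alone rather than by `Clicks_GA * 100` as in the posted line; change that if the original scaling is what you want.

```python
import numpy as np
import pandas as pd

def add_discrepancy(df, threshold=0.05):
    """Hypothetical helper: adds both discrepancy columns to df in place."""
    # Assumption: the discrepancy is a fraction of Clicks_GA (0.05 == 5%).
    df['Discrep_%'] = np.absolute(df['Clicks'] - df['Clicks_GA']) / df['Clicks_GA']
    # True where the discrepancy exceeds the threshold.
    df['Discrep_Found'] = df['Discrep_%'] > threshold
    return df

# First row is from the question; the second row is invented sample data.
df = pd.DataFrame({
    'ID': [5939, 6000],
    'Clicks': [18482, 120],
    'Clicks_GA': [18480, 100],
})
add_discrepancy(df)
```

Since the function returns the DataFrame it was given, it can be called on each of your DataFrames in turn without repeating the two lines.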
Upvotes: 1
Reputation: 19565
You could loop through the DataFrames. For example:
for df in [df1, df2, df3, ...]:
    df['Discrep_%'] = np.absolute(df['Clicks'] - df['Clicks_GA']) / (df['Clicks_GA'] * 100)
    df['Discrep_Found'] = (df['Discrep_%'] > .05)
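A possible variation on the loop, in case you would rather not modify the originals: keep the DataFrames in a dict and build new ones with `DataFrame.assign`. Treat this as a sketch. The names `with_discrepancy`, `raw`, `frames`, `site_a` and `site_b` are hypothetical, the `site_b` row is invented data, and the ratio is assumed to be a plain fraction of `Clicks_GA` (0.05 meaning 5%), which differs from the `* 100` scaling in the posted line.

```python
import pandas as pd

def with_discrepancy(df, threshold=0.05):
    """Hypothetical helper: returns a copy of df with both columns added."""
    # Assumption: the discrepancy is a fraction of Clicks_GA (0.05 == 5%).
    ratio = (df['Clicks'] - df['Clicks_GA']).abs() / df['Clicks_GA']
    # The column names are not valid identifiers, so pass them via a dict.
    return df.assign(**{'Discrep_%': ratio, 'Discrep_Found': ratio > threshold})

# site_a uses the row from the question; site_b is invented sample data.
raw = {
    'site_a': pd.DataFrame({'ID': [5939], 'Clicks': [18482], 'Clicks_GA': [18480]}),
    'site_b': pd.DataFrame({'ID': [6000], 'Clicks': [120], 'Clicks_GA': [100]}),
}

# Build a new dict of processed DataFrames; the originals in raw are untouched.
frames = {name: with_discrepancy(df) for name, df in raw.items()}
```

The dict keys give you a way to look up each processed DataFrame afterwards, for example `frames['site_a']`.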
Upvotes: 0