Justin

Reputation: 13

How can I create a function to make this step easier in Pandas?

I have a DataFrame that looks like this:

ID    Clicks  Clicks_GA  Discrep_%  Discrep_Found
5939  18482   18480      .01        False

#Calculates the discrepancy % (I also import numpy as np)

df['Discrep_%'] = np.absolute(df['Clicks'] - df['Clicks_GA']) / (df['Clicks_GA'] * 100)

#True if the discrepancy is greater than 5%, otherwise False

df['Discrep_Found'] = (df['Discrep_%'] > .05)

The problem is that I have multiple DataFrames, and I don't want to copy and paste the same lines of code a bunch of times.

Is there a function I can use to make this process simpler?

Thanks!

Upvotes: 0

Views: 37

Answers (2)

dimay

Reputation: 2804

Try this:

def count_some(df):
    val = np.absolute(df['Clicks'] - df['Clicks_GA']) / (df['Clicks_GA'] * 100)
    return val, val > .05

df[["Discrep_%", "Discrep_Found"]] = df.apply(count_some, axis=1, result_type='expand')
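Note that apply with axis=1 calls the function once per row, which is usually slower on large frames than whole-column arithmetic. As a rough sketch of a vectorized alternative, something like the following might work. The name add_discrepancy, its threshold parameter, and the one-row sample frame are hypothetical and only for illustration. The formula is copied unchanged from the question, so it is worth checking that dividing by Clicks_GA * 100 gives the units you intend.

```python
import numpy as np
import pandas as pd


def add_discrepancy(df, threshold=0.05):
    """Add 'Discrep_%' and 'Discrep_Found' columns to df in place and return it.

    Hypothetical helper; the formula is copied unchanged from the question.
    """
    # Operates on whole columns at once instead of row by row.
    df['Discrep_%'] = np.absolute(df['Clicks'] - df['Clicks_GA']) / (df['Clicks_GA'] * 100)
    # True where the computed discrepancy exceeds the threshold.
    df['Discrep_Found'] = df['Discrep_%'] > threshold
    return df


# Hypothetical sample frame built from the single row shown in the question.
df = pd.DataFrame({'ID': [5939], 'Clicks': [18482], 'Clicks_GA': [18480]})
df = add_discrepancy(df)
```

Because the helper returns the frame, it can also be used with df.pipe(add_discrepancy) or called once per DataFrame.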

Upvotes: 1

Derek O

Reputation: 19565

You could loop through the DataFrames. For example:

for df in [df1, df2, df3, ...]:
    df['Discrep_%'] = np.absolute(df['Clicks'] - df['Clicks_GA']) / (df['Clicks_GA'] * 100)
    df['Discrep_Found'] = (df['Discrep_%'] > .05)
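This works because assigning a column modifies the DataFrame object itself, so each original variable sees the new columns after the loop. Here is a small self-contained sketch of that behavior. The frames dictionary and its sample values are made up for illustration, and the formula is the one posted in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frames standing in for df1, df2, and so on.
frames = {
    'df1': pd.DataFrame({'Clicks': [18482, 1000], 'Clicks_GA': [18480, 100]}),
    'df2': pd.DataFrame({'Clicks': [10], 'Clicks_GA': [10]}),
}

for name, frame in frames.items():
    # Column assignment mutates the existing object, so the dictionary
    # entries are updated without reassigning them.
    frame['Discrep_%'] = np.absolute(frame['Clicks'] - frame['Clicks_GA']) / (frame['Clicks_GA'] * 100)
    frame['Discrep_Found'] = frame['Discrep_%'] > .05

flagged = list(frames['df1']['Discrep_Found'])
```

Keeping the frames in a dictionary (an assumption here, not something the question requires) makes it easy to look each result up by name afterwards.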

Upvotes: 0
