MYK

Reputation: 3007

How do you wrap existing functions in conditional logic in Python?

I have a codebase where this pattern is very common:

df  # some pandas DataFrame with columns userId, sessionId

def add_session_statistics(df):
   df_statistics = get_session_statistics(df.sessionId.unique())
   return df.merge(df_statistics, on='sessionId', how='left')

def add_user_statistics(df):
   # Look up per-user statistics (not a recursive call) and merge on the user key
   df_statistics = get_user_statistics(df.userId.unique())
   return df.merge(df_statistics, on='userId', how='left')

# etc..

df_enriched = (df
               .pipe(add_session_statistics)
               .pipe(add_user_statistics)
               )

However, in another part of the codebase, 'userId' and 'sessionId' form the index of the dataframe rather than being columns. Something like:

X = df.set_index(['userId', 'sessionId'])

This means I can't use the add_{something}_statistics() functions on X without resetting the index each time.

Is there a decorator I can add to the add_{something}_statistics() functions that makes them reset the index, and restore it afterwards, if they get a KeyError when attempting the merge on a column that is not there?

Upvotes: 0

Views: 35

Answers (1)

MYK

Reputation: 3007

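This seems to work. The decorator first tries the function on the dataframe as-is; if that raises, it resets the index, applies the function, and then restores the original index: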

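import functools

def index_suspension_on_add(add_function):
    """Retry add_function with the index reset if the first attempt fails."""

    @functools.wraps(add_function)
    def _helper(df):
        try:
            return df.pipe(add_function)
        except (KeyError, AttributeError):
            # df['userId'] raises KeyError and df.userId raises AttributeError
            # when the name is an index level rather than a column.
            index_names = list(df.index.names)

            return (df
                    .reset_index()
                    .pipe(add_function)
                    .set_index(index_names)
            )

    return _helper

@index_suspension_on_add
def add_user_statistics(df):
    ...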
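
For a quick check, here is a minimal self-contained sketch of the same decorator run on a toy dataframe. The `get_user_statistics` function and its `n_orders` column are hypothetical stand-ins for whatever lookup the real codebase uses, and the sample data is made up. This version catches `KeyError` and `AttributeError` only, on the assumption that those are what a missing column raises.

```python
import functools

import pandas as pd


def index_suspension_on_add(add_function):
    """Retry add_function on a flat frame if the key columns live in the index."""

    @functools.wraps(add_function)
    def _helper(df):
        try:
            return add_function(df)
        except (KeyError, AttributeError):
            # Remember the index levels, flatten, enrich, then restore them.
            index_names = list(df.index.names)
            return df.reset_index().pipe(add_function).set_index(index_names)

    return _helper


def get_user_statistics(user_ids):
    # Hypothetical stand-in: one row per user with a made-up statistic.
    return pd.DataFrame(
        {"userId": list(user_ids), "n_orders": [10 * int(u) for u in user_ids]}
    )


@index_suspension_on_add
def add_user_statistics(df):
    df_statistics = get_user_statistics(df.userId.unique())
    return df.merge(df_statistics, on="userId", how="left")


df = pd.DataFrame({"userId": [1, 1, 2], "sessionId": [10, 11, 12]})
X = df.set_index(["userId", "sessionId"])

flat_result = add_user_statistics(df)    # columns present, first attempt succeeds
indexed_result = add_user_statistics(X)  # df.userId fails, so the fallback runs
```

Note that the fallback assumes the index levels are named. With a default unnamed index, `reset_index()` creates a column called `index` and the final `set_index` would not find the original name.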

Upvotes: 1
