user2007506

Reputation: 79

python crosstab function in a loop

I have been working in Python for quite a while, but I am stuck on a simple problem. I have to run the crosstab function for different variables against the same ID variable (MasterUserId):

pd.crosstab(data['MasterUserId'],visittime_cat)
pd.crosstab(data['MasterUserId'],week_cat)

Now I want to do the same about 7-8 times. Instead of calling the crosstab function repeatedly, I want to put it inside a loop and generate a crosstab dataset on each iteration. I tried this, but it was not successful:

def cross_tab(id_col,field):
    col_names=['visittime_cat','week_cat','var3','var4']
    for i in col_names:
        'crosstab_{ }'.format(i)=pd.crosstab(id_col,i)

I want to generate datasets named crosstab_visittime_cat, crosstab_week_cat, or crosstab_1, crosstab_2, and so on.

Upvotes: 1

Views: 2143

Answers (1)

ODiogoSilva

Reputation: 2414

Might I suggest storing the datasets in a dictionary? A formatted string cannot be the target of an assignment, but it works fine as a dictionary key:

def cross_tab(data_frame, id_col):
    col_names=['visittime_cat','week_cat','var3','var4']
    datasets = {}
    for i in col_names:
        datasets['crosstab_{}'.format(i)] = pd.crosstab(data_frame[id_col], data_frame[i])
    return datasets

Testing with a fictional data set:

import numpy as np
import pandas as pd

data = pd.DataFrame({'MasterUserId': ['one', 'one', 'two', 'three'] * 6,
             'visittime_cat': ['A', 'B', 'C'] * 8,
             'week_cat': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 4,
             'var3': np.random.randn(24),
             'var4': np.random.randn(24)})

storage = cross_tab(data, "MasterUserId")

storage.keys()
['crosstab_week_cat', 'crosstab_var4', 'crosstab_visittime_cat', 'crosstab_var3']

storage['crosstab_week_cat']
week_cat      bar  foo
MasterUserId          
one             6    6
three           4    2
two             2    4

[3 rows x 2 columns]
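
If the list of columns changes between calls, the same idea can be written with the column names passed in as an argument rather than hard-coded. The following is only a sketch under that assumption: the function name `cross_tabs` and its `col_names` parameter are hypothetical and not part of the code above, and the sample data reuses only the two categorical columns from the test data.

```python
import pandas as pd


def cross_tabs(data_frame, id_col, col_names):
    """Return a dict mapping each column name to its crosstab against id_col.

    Hypothetical helper: a dict comprehension variant of cross_tab above.
    """
    return {
        col: pd.crosstab(data_frame[id_col], data_frame[col])
        for col in col_names
    }


# Same fictional data as above, limited to the categorical columns.
data = pd.DataFrame({
    'MasterUserId': ['one', 'one', 'two', 'three'] * 6,
    'visittime_cat': ['A', 'B', 'C'] * 8,
    'week_cat': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 4,
})

tables = cross_tabs(data, 'MasterUserId', ['visittime_cat', 'week_cat'])

# Each crosstab is looked up by column name; loop over all of them like this.
for name, table in tables.items():
    print(name)
    print(table)
```

Keying the dictionary by the plain column name keeps lookups short (`tables['week_cat']`); prefix the key with `crosstab_` if you prefer the naming from the question.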

Upvotes: 1
