aziz shaw

Reputation: 144

Get the path of a recursive for loop in Python

Here is a sample dataframe:

import pandas as pd

test_list = [['male','pack','lower'], ['male','nonpack','upper'], ['female','pack','upper'], ['male','pack','middle'], ['female','nonpack','middle']]
df1 = pd.DataFrame(test_list, columns=['gender', 'subscription', 'ageCategory'])


The output I need is:

 gender-male|subscription-pack|ageCategory-lower
 gender-male|subscription-pack|ageCategory-upper
 gender-male|subscription-pack|ageCategory-middle
 gender-male|subscription-nonpack|ageCategory-lower
 gender-male|subscription-nonpack|ageCategory-upper
 gender-male|subscription-nonpack|ageCategory-middle
 gender-female|subscription-pack|ageCategory-lower
 gender-female|subscription-pack|ageCategory-upper
 gender-female|subscription-pack|ageCategory-middle
 gender-female|subscription-nonpack|ageCategory-lower
 gender-female|subscription-nonpack|ageCategory-upper
 gender-female|subscription-nonpack|ageCategory-middle

This is the recursive function I wrote to obtain that output:

global name_list
name_list = []
global flag
flag=1
list_tuples = []
def do_calcuations1(iteration_list, level, df1, length=None):
    global name
    global flag
    if level < 3:       
        name_list.append(iteration_list[level])
        catogries = df1[iteration_list[level]].unique()
        for cata in catogries:
            name_list.append("-"+str(cata))   
            df2 = df1[df1[iteration_list[level]] == cata]
            do_calcuations1(iteration_list, level + 1, df2, len(catogries))
            name_list.pop()
    else:
        if level == 3:          
            print("the name from list is ", "|".join(name_list))
            info = (name, df1)
            list_tuples.append(info)         
            if length == flag:
                name_list.pop()
            else:
                flag += 1
        pass
iteration_list=['gender', 'subscription', 'ageCategory']
do_calcuations1(iteration_list,0,df1)

The output I am getting is:

 gender|-male|subscription|-pack|ageCategory|-lower
 gender|-male|subscription|-pack|ageCategory|-middle
 gender|-male|subscription|-nonpack|ageCategory|-upper
 gender|-male|subscription|-female|subscription|-pack|ageCategory|-upper
 gender|-male|subscription|-female|subscription|-pack|-nonpack|ageCategory|-middle

I am trying to save each unique dataframe together with its path as a tuple and append that tuple to a list. Can anyone show me the right approach? Thanks!

Upvotes: 0

Views: 59

Answers (1)

Henry Yik

Reputation: 22503

You can use pd.MultiIndex.from_product to create the Cartesian product of the unique values in the three columns:

s = (pd.MultiIndex.from_product([("gender-"+df1["gender"]).unique(),
                                   ("subscription-"+df1["subscription"]).unique(),
                                   ("ageCategory-"+df1["ageCategory"]).unique()],
                                  names=df1.columns)
       .to_frame(False))

print (s)

           gender          subscription         ageCategory
0     gender-male     subscription-pack   ageCategory-lower
1     gender-male     subscription-pack   ageCategory-upper
2     gender-male     subscription-pack  ageCategory-middle
3     gender-male  subscription-nonpack   ageCategory-lower
4     gender-male  subscription-nonpack   ageCategory-upper
5     gender-male  subscription-nonpack  ageCategory-middle
6   gender-female     subscription-pack   ageCategory-lower
7   gender-female     subscription-pack   ageCategory-upper
8   gender-female     subscription-pack  ageCategory-middle
9   gender-female  subscription-nonpack   ageCategory-lower
10  gender-female  subscription-nonpack   ageCategory-upper
11  gender-female  subscription-nonpack  ageCategory-middle
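
If only the pipe-joined strings from the question are needed, the same Cartesian product can also be built with the standard library. The following is a minimal sketch, assuming the sample df1 from the question; the names labels and paths are illustrative and not part of the original answer.

```python
from itertools import product

import pandas as pd

test_list = [['male', 'pack', 'lower'], ['male', 'nonpack', 'upper'],
             ['female', 'pack', 'upper'], ['male', 'pack', 'middle'],
             ['female', 'nonpack', 'middle']]
df1 = pd.DataFrame(test_list, columns=['gender', 'subscription', 'ageCategory'])

# One list of "column-value" labels per column, in order of first appearance.
labels = [[f"{col}-{val}" for val in df1[col].unique()] for col in df1.columns]

# Cartesian product of the label lists, joined with "|" (hypothetical name: paths).
paths = ["|".join(combo) for combo in product(*labels)]

for path in paths:
    print(path)
```

Because unique() keeps values in order of first appearance, the strings come out in the same order as the desired output in the question.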

# As key-value pairs (joined path -> one-row DataFrame):
print ({"|".join(y.iloc[0]): y for _, y in s.groupby(level=0)})
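
For the (path, dataframe) tuples asked about in the question, one option is to keep the recursion but pass the path down as an argument instead of mutating global lists, so nothing has to be popped afterwards. This is a sketch under assumptions, not the original poster's or the answerer's code: the function name collect_paths and its parameters are hypothetical. Note that it only returns combinations that actually occur in df1 (five for the sample data), not the full 12-row Cartesian product.

```python
import pandas as pd

test_list = [['male', 'pack', 'lower'], ['male', 'nonpack', 'upper'],
             ['female', 'pack', 'upper'], ['male', 'pack', 'middle'],
             ['female', 'nonpack', 'middle']]
df1 = pd.DataFrame(test_list, columns=['gender', 'subscription', 'ageCategory'])


def collect_paths(columns, df, path=()):
    """Return (path_string, sub_dataframe) tuples for the value combinations in df.

    Hypothetical helper: `path` holds the labels chosen so far and is
    extended by building a new tuple, so no shared state is modified.
    """
    if not columns:
        # All columns consumed: df now holds the rows matching this path.
        return [("|".join(path), df)]
    first, rest = columns[0], columns[1:]
    results = []
    for value in df[first].unique():
        subset = df[df[first] == value]
        results.extend(collect_paths(rest, subset, path + (f"{first}-{value}",)))
    return results


list_tuples = collect_paths(['gender', 'subscription', 'ageCategory'], df1)
for name, frame in list_tuples:
    print(name, len(frame))
```

Each recursive call receives its own path tuple, which avoids the leftover entries that appear in the question's output when the shared name_list is popped unevenly.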

Upvotes: 1
