Reputation: 144
Here is a sample DataFrame:
import pandas as pd

test_list = [['male', 'pack', 'lower'], ['male', 'nonpack', 'upper'], ['female', 'pack', 'upper'], ['male', 'pack', 'middle'], ['female', 'nonpack', 'middle']]
df1 = pd.DataFrame(test_list, columns=['gender', 'subscription', 'ageCategory'])
The output I need is:
gender-male|subscription-pack|ageCategory-lower
gender-male|subscription-pack|ageCategory-upper
gender-male|subscription-pack|ageCategory-middle
gender-male|subscription-nonpack|ageCategory-lower
gender-male|subscription-nonpack|ageCategory-upper
gender-male|subscription-nonpack|ageCategory-middle
gender-female|subscription-pack|ageCategory-lower
gender-female|subscription-pack|ageCategory-upper
gender-female|subscription-pack|ageCategory-middle
gender-female|subscription-nonpack|ageCategory-lower
gender-female|subscription-nonpack|ageCategory-upper
gender-female|subscription-nonpack|ageCategory-middle
This is the recursive function I wrote to obtain the output:
global name_list
name_list = []
global flag
flag = 1
list_tuples = []

def do_calcuations1(iteration_list, level, df1, length=None):
    global name
    global flag
    if level < 3:
        name_list.append(iteration_list[level])
        catogries = df1[iteration_list[level]].unique()
        for cata in catogries:
            name_list.append("-" + str(cata))
            df2 = df1[df1[iteration_list[level]] == cata]
            do_calcuations1(iteration_list, level + 1, df2, len(catogries))
            name_list.pop()
    else:
        if level == 3:
            print("the name from list is ", "|".join(name_list))
            info = (name, df1)
            list_tuples.append(info)
            if length == flag:
                name_list.pop()
            else:
                flag += 1
                pass

iteration_list = ['gender', 'subscription', 'ageCategory']
do_calcuations1(iteration_list, 0, df1)
The output I am getting is:
gender|-male|subscription|-pack|ageCategory|-lower
gender|-male|subscription|-pack|ageCategory|-middle
gender|-male|subscription|-nonpack|ageCategory|-upper
gender|-male|subscription|-female|subscription|-pack|ageCategory|-upper
gender|-male|subscription|-female|subscription|-pack|-nonpack|ageCategory|-middle
I am trying to save each unique DataFrame together with its path as a tuple and append that tuple to a list. Can anyone show me the right approach to do this? Thanks!
Upvotes: 0
Views: 59
Reputation: 22503
You can use pd.MultiIndex.from_product to create the Cartesian product of the unique values of the three columns:
s = (pd.MultiIndex.from_product([("gender-" + df1["gender"]).unique(),
                                 ("subscription-" + df1["subscription"]).unique(),
                                 ("ageCategory-" + df1["ageCategory"]).unique()],
                                names=df1.columns)
       .to_frame(index=False))
print(s)
           gender          subscription         ageCategory
0     gender-male     subscription-pack   ageCategory-lower
1     gender-male     subscription-pack   ageCategory-upper
2     gender-male     subscription-pack  ageCategory-middle
3     gender-male  subscription-nonpack   ageCategory-lower
4     gender-male  subscription-nonpack   ageCategory-upper
5     gender-male  subscription-nonpack  ageCategory-middle
6   gender-female     subscription-pack   ageCategory-lower
7   gender-female     subscription-pack   ageCategory-upper
8   gender-female     subscription-pack  ageCategory-middle
9   gender-female  subscription-nonpack   ageCategory-lower
10  gender-female  subscription-nonpack   ageCategory-upper
11  gender-female  subscription-nonpack  ageCategory-middle
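If the pipe-separated strings from the question are wanted rather than a three-column frame, one possible way is to join each row of the product. This is only a sketch: the names levels and paths are made up for illustration, and the sample data is repeated so the snippet runs on its own.

```python
import pandas as pd

test_list = [['male', 'pack', 'lower'], ['male', 'nonpack', 'upper'],
             ['female', 'pack', 'upper'], ['male', 'pack', 'middle'],
             ['female', 'nonpack', 'middle']]
df1 = pd.DataFrame(test_list, columns=['gender', 'subscription', 'ageCategory'])

# Prefix each column's unique values with the column name, e.g. "gender-male".
# `levels` is a hypothetical helper name, not part of the original answer.
levels = [(col + "-" + df1[col]).unique() for col in df1.columns]

# Cartesian product of the three label sets, as a plain DataFrame.
s = pd.MultiIndex.from_product(levels, names=df1.columns).to_frame(index=False)

# Join the three labels of every row with "|" to get one path string per row.
paths = s.apply("|".join, axis=1).tolist()
print("\n".join(paths))
```

The strings come out in the same order as the rows of s, so the first one is gender-male|subscription-pack|ageCategory-lower.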
# As key-value pairs (each key is the joined path, each value the matching one-row frame of s):
print({"|".join(y.iloc[0]): y for _, y in s.groupby(level=0)})
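The question also asks for each sub-DataFrame of df1 paired with its path. A rough sketch of that, swapping in itertools.product for the recursion, could look like the following. The name list_tuples is borrowed from the question, while columns, uniques, combo and mask are illustrative names. It assumes that combinations with no matching rows should be kept with an empty frame, since the desired output lists all 12 paths.

```python
from itertools import product

import pandas as pd

test_list = [['male', 'pack', 'lower'], ['male', 'nonpack', 'upper'],
             ['female', 'pack', 'upper'], ['male', 'pack', 'middle'],
             ['female', 'nonpack', 'middle']]
df1 = pd.DataFrame(test_list, columns=['gender', 'subscription', 'ageCategory'])

columns = ['gender', 'subscription', 'ageCategory']
# Unique values per column, in order of first appearance.
uniques = [df1[col].unique() for col in columns]

list_tuples = []
for combo in product(*uniques):
    # Build the path, e.g. "gender-male|subscription-pack|ageCategory-lower".
    path = "|".join(f"{col}-{val}" for col, val in zip(columns, combo))

    # Select the rows of df1 that match every (column, value) pair.
    mask = pd.Series(True, index=df1.index)
    for col, val in zip(columns, combo):
        mask &= df1[col] == val

    # Assumption: empty subsets are kept; skip them here if they are not wanted.
    list_tuples.append((path, df1[mask]))

for path, subset in list_tuples:
    print(path, len(subset))
```

Building the path from the loop variables, rather than from a shared global list, avoids the push and pop bookkeeping that produced the garbled paths in the question.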
Upvotes: 1