Looping through two lists of DataFrames

Question

Below is the script I am working with. For practice, I created two sets of DataFrames: df1, df2, and df3 in one set, and dv1, dv2, and dv3 in the other. I then put them into two lists, test and test2, and combined those into zipped_list. Now I am trying to write a loop that does the following for each pair: 1. Set the index and concatenate the pair with keys '2022' and '2021'. 2. Swap the column levels so the 2022 and 2021 columns sit next to each other. The loop runs, but it is only applied to the first pair of DataFrames. Without calling each DataFrame one by one, how can I apply it to every pair in zipped_list?

import pandas as pd
#Creating a set of dataframes
data = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['hp', 'logitech', 'samsung', 'lg', 'lenovo'],
        'price': [1200, 150, 300, 450, 200]}
df1 = pd.DataFrame(data)

data2 = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['hp', 'mac', 'fujitsu', 'lg', 'asus'],
        'price': [2200, 200, 300, 450, 200]}
df2 = pd.DataFrame(data2)

data3 = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['microsoft', 'logitech', 'samsung', 'lg', 'asus'],
        'price': [1500, 100, 200, 350, 400]}
df3 = pd.DataFrame(data3)

#Creating another set of dataframes
data = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['hp', 'logitech', 'samsung', 'lg', 'lenovo'],
        'price': [10, 20, 30, 40, 50]}
dv1 = pd.DataFrame(data)

data2 = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['hp', 'mac', 'fujitsu', 'lg', 'asus'],
        'price': [10, 20, 30, 50, 50]}
dv2 = pd.DataFrame(data2)

data3 = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['microsoft', 'logitech', 'samsung', 'lg', 'asus'],
        'price': [1, 2, 3, 4, 5]}
dv3 = pd.DataFrame(data3)

#creating a list for dataframe
test=[df1,df2,df3]
test2=[dv1,dv2,dv3]

#combining two lists
zipped = zip(test, test2)
zipped_list = list(zipped)

#Looping through the zipped_list
for x,y in zipped_list:
    z=pd.concat([zipped_list[0][0].set_index(['product_name','item_name']), zipped_list[0][1].set_index(['product_name','item_name'])], 
                    axis='columns', keys=['2022', '2021'])
    z=z.swaplevel(axis='columns')[zipped_list[0][0].columns[2:]] 
print(z)

This prints a single DataFrame, built from the first pair. There should be two more, one for each remaining pair.

trandangtrungduc · Accepted Answer

The reason is that the loop body always reads zipped_list[0], the first pair, and never uses the loop variables x and y, so every iteration rebuilds the same DataFrame. Use the current pair from the loop instead, and append each modified DataFrame to a new list:

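new_list = []
for x, y in zipped_list:
    # x is the 2022 frame and y the 2021 frame of the current pair
    z = pd.concat([x.set_index(['product_name', 'item_name']), y.set_index(['product_name', 'item_name'])],
                  axis='columns', keys=['2022', '2021'])
    # swap the column levels, then select by column name so the years sit side by side
    z = z.swaplevel(axis='columns')[x.columns[2:]]
    new_list.append(z)
new_list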

Output:

[                       price     
                         2022 2021
 product_name item_name           
 laptop       hp         1200   10
 printer      logitech    150   20
 tablet       samsung     300   30
 desk         lg          450   40
 chair        lenovo      200   50,
                        price     
                         2022 2021
 product_name item_name           
 laptop       hp         2200   10
 printer      mac         200   20
 tablet       fujitsu     300   30
 desk         lg          450   50
 chair        asus        200   50,
                        price     
                         2022 2021
 product_name item_name           
 laptop       microsoft  1500    1
 printer      logitech    100    2
 tablet       samsung     200    3
 desk         lg          350    4
 chair        asus        400    5]
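
The same idea can also be written with a small helper function and a list comprehension. This is only a minimal sketch under a few assumptions: the names combine and KEYS are hypothetical and not part of the original script, the two lists are zipped directly instead of going through zipped_list, and the sample frames are cut down to two rows so the block stays short.

```python
import pandas as pd

# Index columns shared by every frame (hypothetical constant, taken from the question's data).
KEYS = ["product_name", "item_name"]


def combine(current, previous):
    """Place two frames side by side under the labels '2022' and '2021'."""
    z = pd.concat(
        [current.set_index(KEYS), previous.set_index(KEYS)],
        axis="columns",
        keys=["2022", "2021"],
    )
    # Every column that is not part of the index is a value column.
    value_cols = [c for c in current.columns if c not in KEYS]
    # Move the year labels below the column names, then select by column
    # name so the 2022 and 2021 values of each column end up adjacent.
    return z.swaplevel(axis="columns")[value_cols]


# Two small pairs of frames, reduced from the question's data.
df_a = pd.DataFrame({"product_name": ["laptop", "desk"],
                     "item_name": ["hp", "lg"],
                     "price": [1200, 450]})
dv_a = pd.DataFrame({"product_name": ["laptop", "desk"],
                     "item_name": ["hp", "lg"],
                     "price": [10, 40]})
df_b = pd.DataFrame({"product_name": ["laptop", "desk"],
                     "item_name": ["hp", "lg"],
                     "price": [2200, 450]})
dv_b = pd.DataFrame({"product_name": ["laptop", "desk"],
                     "item_name": ["hp", "lg"],
                     "price": [10, 50]})

# zip() yields one (current, previous) pair per iteration, so each pair is
# processed exactly once and nothing is hard-coded to the first element.
results = [combine(cur, prev) for cur, prev in zip([df_a, df_b], [dv_a, dv_b])]

for frame in results:
    print(frame, end="\n\n")
```

Keeping the per-pair logic in one function means the loop itself cannot accidentally refer back to zipped_list[0], which was the original problem.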
