Below is the script I am working with. For practice, I created two sets of dataframes: df1, df2, and df3, and dv1, dv2, and dv3. I then put them into two lists, test and test2, and combined those into zipped_list. Now I am trying to write a loop that does the following for each pair: 1. Set the index and concatenate the pair with keys '2022' and '2021'. 2. Swap the column levels so the two years sit next to each other. The loop runs, but it is only applied to the first pair of dataframes. Without calling each dataframe one by one, how can I apply it to every pair in zipped_list?
import pandas as pd
#Creating a set of dataframes
data = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['hp', 'logitech', 'samsung', 'lg', 'lenovo'],
'price': [1200, 150, 300, 450, 200]}
df1 = pd.DataFrame(data)
data2 = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['hp', 'mac', 'fujitsu', 'lg', 'asus'],
'price': [2200, 200, 300, 450, 200]}
df2 = pd.DataFrame(data2)
data3 = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['microsoft', 'logitech', 'samsung', 'lg', 'asus'],
'price': [1500, 100, 200, 350, 400]}
df3 = pd.DataFrame(data3)
#Creating another set of dataframes
data = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['hp', 'logitech', 'samsung', 'lg', 'lenovo'],
'price': [10, 20, 30, 40, 50]}
dv1 = pd.DataFrame(data)
data2 = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['hp', 'mac', 'fujitsu', 'lg', 'asus'],
'price': [10, 20, 30, 50, 50]}
dv2 = pd.DataFrame(data2)
data3 = {'product_name': ['laptop', 'printer', 'tablet', 'desk', 'chair'],'item_name': ['microsoft', 'logitech', 'samsung', 'lg', 'asus'],
'price': [1, 2, 3, 4, 5]}
dv3 = pd.DataFrame(data3)
#creating a list for dataframe
test=[df1,df2,df3]
test2=[dv1,dv2,dv3]
#combining two lists
zipped = zip(test, test2)
zipped_list = list(zipped)
#Looping through the zipped_list
for x,y in zipped_list:
    z=pd.concat([zipped_list[0][0].set_index(['product_name','item_name']), zipped_list[0][1].set_index(['product_name','item_name'])],
                axis='columns', keys=['2022', '2021'])
    z=z.swaplevel(axis='columns')[zipped_list[0][0].columns[2:]]
print(z)
In addition to this dataframe, there should be two more.
Upvotes: 1
Views: 297
Reputation: 335
The reason is that the loop body always reads zipped_list[0], so it only ever touches the first pair and never uses the loop variables (x and y). Index into the loop variable instead, and append each modified dataframe to a new list:
new_list = []
for x in zipped_list:
    # x is one (2022 frame, 2021 frame) pair from zipped_list
    z = pd.concat([x[0].set_index(['product_name', 'item_name']),
                   x[1].set_index(['product_name', 'item_name'])],
                  axis='columns', keys=['2022', '2021'])
    z = z.swaplevel(axis='columns')[x[0].columns[2:]]
    new_list.append(z)
new_list
Output:
[                        price
                         2022  2021
 product_name item_name
 laptop       hp         1200    10
 printer      logitech    150    20
 tablet       samsung     300    30
 desk         lg          450    40
 chair        lenovo      200    50,
                         price
                         2022  2021
 product_name item_name
 laptop       hp         2200    10
 printer      mac         200    20
 tablet       fujitsu     300    30
 desk         lg          450    50
 chair        asus        200    50,
                         price
                         2022  2021
 product_name item_name
 laptop       microsoft  1500     1
 printer      logitech    100     2
 tablet       samsung     200     3
 desk         lg          350     4
 chair        asus        400     5]
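If you prefer, the same idea can be written as a small function plus a list comprehension. This is only a sketch: the helper name side_by_side and its keys parameter are made up for illustration, and the data is a shortened copy of the first two pairs from the question. It assumes every frame has the product_name and item_name columns and that both frames in a pair contain the same rows.

```python
import pandas as pd

def side_by_side(current, previous, keys=("2022", "2021")):
    """Hypothetical helper: place the value columns of two frames next to each other."""
    index_cols = ["product_name", "item_name"]
    # Everything that is not part of the index is treated as a value column
    value_cols = [c for c in current.columns if c not in index_cols]
    combined = pd.concat(
        [current.set_index(index_cols), previous.set_index(index_cols)],
        axis="columns",
        keys=list(keys),
    )
    # Put the column name on top and the year underneath, grouped by column name
    return combined.swaplevel(axis="columns")[value_cols]

# Shortened stand-ins for the frames in the question
df1 = pd.DataFrame({"product_name": ["laptop", "printer"],
                    "item_name": ["hp", "logitech"],
                    "price": [1200, 150]})
dv1 = pd.DataFrame({"product_name": ["laptop", "printer"],
                    "item_name": ["hp", "logitech"],
                    "price": [10, 20]})
df2 = pd.DataFrame({"product_name": ["laptop", "printer"],
                    "item_name": ["hp", "mac"],
                    "price": [2200, 200]})
dv2 = pd.DataFrame({"product_name": ["laptop", "printer"],
                    "item_name": ["hp", "mac"],
                    "price": [10, 20]})

test = [df1, df2]
test2 = [dv1, dv2]

# Unpack each pair directly, so no zipped_list[0] indexing is needed
results = [side_by_side(x, y) for x, y in zip(test, test2)]

for frame in results:
    print(frame)
```

Unpacking the pair as x, y in the loop header makes it hard to accidentally fall back to a fixed index such as zipped_list[0].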
Upvotes: 0