Reputation: 43
I have a dataframe with these columns:
A B C D AA1 AA2 BB1 BB2 CC1 CC2
I defined a list of tuples holding the names of the paired variables:
col2 = [
('AA1','AA2'),
('BB1','BB2'),
('CC1','CC2')
]
And a list with the first 4 variables:
col1 = ['A','B','C','D']
My aim is to create three different data frames (one each for AA, BB and CC), each containing the variables named in col1. I want to do this with a for loop that iterates through each tuple in col2, keeping the first element (e.g. AA1) and dropping the second (e.g. AA2), and likewise for BB and CC.
This is my desired final output:
df1: A B C D AA1
df2: A B C D BB1
df3: A B C D CC1
I have tried with this function:
def func1(df, first, second):
    df1 = pd.concat([df[col1],df[first[x]]],axis=1)
    df1 = df.drop(second[y],axis=1)
    df1 = df1.loc[:,~df1.columns.duplicated()]
    return df1.reset_index(drop=True)
first1,second1 = zip(*col2)
first1 = list(first1)
second1 = list(second1)
for x,y in first1,second1:
    df = func1(df_input,first=x,second=y)
    output += [(df)]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-135-11df6ece8a12> in <module>()
5 second1 = list(second1)
6
----> 7 for x,y in first1,second1:
8
9 df = func1(df_input,first=x,second=y)
ValueError: too many values to unpack (expected 2)
I don't understand what I'm doing wrong. Would anyone be able to help me?
Thanks a lot.
Upvotes: 0
Views: 102
Reputation: 375
masks = [col1 + [first] for first, second in col2]
# [['A', 'B', 'C', 'D', 'AA1'], ['A', 'B', 'C', 'D', 'BB1'], ['A', 'B', 'C', 'D', 'CC1']]
frames = [frame.filter(items=mask) for mask in masks]
frames will be a list of pandas.DataFrames from which you can select the frames you want.
You can use a list comprehension to create a list of different masks (e.g. ['A', 'B', 'C', 'D', 'AA1']).
You can then use another list comprehension to filter each mask from your original dataframe (called frame here).
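For anyone who wants to try this end to end, here is a minimal sketch of the two list comprehensions above. The sample values and the way frame is built are assumptions made up for illustration; only the column names, col1 and col2 come from the question.

```python
import pandas as pd

# Hypothetical sample data using the column layout from the question.
columns = ['A', 'B', 'C', 'D', 'AA1', 'AA2', 'BB1', 'BB2', 'CC1', 'CC2']
frame = pd.DataFrame([list(range(10)), list(range(10, 20))], columns=columns)

col1 = ['A', 'B', 'C', 'D']
col2 = [('AA1', 'AA2'), ('BB1', 'BB2'), ('CC1', 'CC2')]

# One list of column names per tuple: the shared columns plus the first element.
masks = [col1 + [first] for first, _second in col2]

# filter(items=...) keeps only the listed column labels of the original frame.
frames = [frame.filter(items=mask) for mask in masks]

for sub in frames:
    print(list(sub.columns))
```

If you need separate names, something like df1, df2, df3 = frames should work, since there are exactly three tuples in col2.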
Upvotes: 0
Reputation: 3624
If I understood you correctly, why not simply:
df1 = df[['A', 'B', 'C', 'D', 'AA1']]
df2 = df[['A', 'B', 'C', 'D', 'BB1']]
df3 = df[['A', 'B', 'C', 'D', 'CC1']]
Upvotes: 1
Reputation: 23825
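Use zip to iterate over the two lists in parallel. Writing for x, y in first1, second1 loops over the tuple (first1, second1) and tries to unpack each three-element list into x and y, which raises the ValueError. See below: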
first1 = [1, 2]
second1 = [3, 4]
for x, y in zip(first1, second1):
    print('{} {}'.format(x, y))
output
1 3
2 4
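Applied to the names from the question, the difference can be sketched like this. It is only an illustration: the variables error and pairs are made up here, and the pandas part is left out so the snippet runs on its own.

```python
col2 = [('AA1', 'AA2'), ('BB1', 'BB2'), ('CC1', 'CC2')]
first1, second1 = zip(*col2)  # two tuples of three names each

# Without zip, the loop iterates over the 2-tuple (first1, second1),
# so the first thing it tries to unpack into x, y is the three-element first1.
try:
    for x, y in first1, second1:
        pass
    error = None
except ValueError as exc:
    error = str(exc)
print(error)

# With zip, each iteration yields one (keep, drop) pair.
pairs = [(x, y) for x, y in zip(first1, second1)]
print(pairs)

# Zipping the unzipped lists just rebuilds col2.
assert pairs == col2
```

Since zipping first1 and second1 gives back the original tuples, iterating with for first, second in col2 directly is probably the simplest option.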
Upvotes: 0