Charles

Reputation: 43

Loop through a list of tuples, keeping the first element and removing the second element from a dataframe

I have a dataframe with these columns:

A B C D AA1 AA2 BB1 BB2 CC1 CC2

I set a list of tuples representing the name of these variables:

col2 = [
       ('AA1','AA2'),
       ('BB1','BB2'),
       ('CC1','CC2')
]

And a list with the first 4 variables:

col1 = ['A','B','C','D']

My aim is to create three different data frames (one each for AA, BB and CC), each containing the columns named in col1 plus one extra column. I want a for loop that iterates through each tuple in col2, keeps the first element (e.g. AA1) and removes the second (e.g. AA2), and likewise for BB and CC.

This is my desired final output:

df1: A B C D AA1

df2: A B C D BB1

df3: A B C D CC1

I have tried with this function:

def func1(df, first, second):


    df1 = pd.concat([df[col1],df[first[x]]],axis=1)

    df1 = df.drop(second[y],axis=1)

    df1 = df1.loc[:,~df1.columns.duplicated()]

    return df1.reset_index(drop=True)



first1,second1 = zip(*col2)

first1 = list(first1)

second1 = list(second1)

for x,y in first1,second1:

    df = func1(df_input,first=x,second=y)

    output += [(df)]


---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-135-11df6ece8a12> in <module>()
      5 second1 = list(second1)
      6 
----> 7 for x,y in first1,second1:
      8 
      9     df = func1(df_input,first=x,second=y)

ValueError: too many values to unpack (expected 2)

I don't understand what I'm doing wrong...would anyone be able to help me?

thanks a lot

Upvotes: 0

Views: 102

Answers (3)

Arno Maeckelberghe

Reputation: 375

masks = [col1 + [first] for first, second in col2]
# [['A', 'B', 'C', 'D', 'AA1'], ['A', 'B', 'C', 'D', 'BB1'], ['A', 'B', 'C', 'D', 'CC1']]
frames = [frame.filter(items=mask) for mask in masks]

Here frame is your original dataframe. frames will be a list of pandas DataFrames, one per tuple in col2, from which you can select the ones you want.

You can use a list comprehension to build one mask (a list of column names, e.g. ['A', 'B', 'C', 'D', 'AA1']) per tuple. You can then use a second list comprehension to filter the original dataframe down to each mask.
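
To make the two comprehensions concrete, here is a minimal, self-contained sketch. The sample values in frame are made up for illustration (the question does not show its data); only the column names, col1 and col2 come from the question.

```python
import pandas as pd

# Hypothetical sample data: the question only gives the column names.
frame = pd.DataFrame(
    [[1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
     [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]],
    columns=['A', 'B', 'C', 'D', 'AA1', 'AA2', 'BB1', 'BB2', 'CC1', 'CC2'],
)
col1 = ['A', 'B', 'C', 'D']
col2 = [('AA1', 'AA2'), ('BB1', 'BB2'), ('CC1', 'CC2')]

# One column list per tuple: the shared columns plus the first element only.
masks = [col1 + [first] for first, _second in col2]

# filter(items=...) keeps only the listed columns, so the second element
# of each tuple (AA2, BB2, CC2) never makes it into the result.
frames = [frame.filter(items=mask) for mask in masks]

# Unpack into named frames if that is more convenient than indexing the list.
df1, df2, df3 = frames
print(list(df1.columns))
```

Because only the wanted columns are selected, there is no need for a separate drop step or for removing duplicated columns afterwards.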

Upvotes: 0

adnanmuttaleb

Reputation: 3624

If I understood you well, why not simply:

df1 = df[['A', 'B', 'C', 'D', 'AA1']]
df2 = df[['A', 'B', 'C', 'D', 'BB1']]
df3 = df[['A', 'B', 'C', 'D', 'CC1']]

Upvotes: 1

balderman

Reputation: 23825

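Your loop "for x,y in first1,second1" iterates over the tuple (first1, second1) itself, so Python tries to unpack each whole list into x and y, which raises the ValueError. Use zip to walk the two lists in parallel: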

first1 = [1, 2]
second1 = [3, 4]
for x, y in zip(first1, second1):
    print('{} {}'.format(x, y))

Output:

1 3
2 4
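
Applied to the lists from the question, the same idea can be sketched as follows. This is only an illustration of the unpacking behaviour; the names error, pairs and direct are introduced here and are not part of the original code.

```python
col2 = [('AA1', 'AA2'), ('BB1', 'BB2'), ('CC1', 'CC2')]
first1, second1 = (list(t) for t in zip(*col2))

# `first1, second1` is a 2-tuple of lists, so the first loop item is the whole
# list ['AA1', 'BB1', 'CC1'], which cannot be unpacked into just x and y.
try:
    for x, y in first1, second1:
        pass
    error = None
except ValueError as exc:
    error = str(exc)

# zip pairs the two lists element by element.
pairs = [(x, y) for x, y in zip(first1, second1)]

# Iterating over col2 directly yields the same pairs without unzipping first.
direct = [(keep, drop) for keep, drop in col2]
```

Since col2 already holds the pairs, looping over it directly avoids the round trip through zip(*col2) and zip(first1, second1).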

Upvotes: 0
