siash
siash

Reputation: 109

merge varying number of columns in pandas

i am trying to merge data frames with various columns. subsetting them i think requires different treatment depending if its with 1 or >1 columns so i tried with if statements, but its not working and im not sure why. any tips would be great thanks so much

edit - the index contains duplicate values so cant use pd.concat - i want to keep these

df1 = pd.DataFrame(data={'cat':[0,2,1], 'dog':[1,2,3]}).set_index([pd.Index([2, 2, 4])])
df2 = pd.DataFrame(data={'mouse':[1,2,3],'parrot':[0,1,2],'elephant':[0,1,2]}).set_index([pd.Index([1, 2, 4])])

# b can be varying but for this instance lets use two columns
# b = 'parrot'
b = 'parrot', 'mouse'

if len(b) > 0:
    if len(b) > 1:
        out = df1.merge(df2[[*b]],left_index=True, right_index=True)
    else:
        out = df1.merge(df2[b],left_index=True, right_index=True)

Upvotes: 1

Views: 30

Answers (2)

Shubham Sharma
Shubham Sharma

Reputation: 71689

Assuming the dataframe does not contains multiindex columns, you can use .loc to select the required columns in df2 then use pd.concat to concatenate dataframe df1 with the selected columns along axis=1

pd.concat([df1, df2.loc[:, b]], axis=1)

Sample run:

# b = 'mouse'
   cat  dog  mouse
0    0    1      1
1    1    2      2
2    2    3      3


# b = 'mouse', 'parrot'
   cat  dog  mouse  parrot
0    0    1      1       0
1    1    2      2       1
2    2    3      3       2

Upvotes: 2

Jonathan Leon
Jonathan Leon

Reputation: 5648

b = 'parrot'
len(b) = 6

b = 'parrot', 'mouse'
len(b) = 2

You can fix this by using lists
b = ['parrot']
and 
b = ['parrot', 'mouse']

df2[[*b]] should become df2[b]

Upvotes: 0

Related Questions