tlhy

Reputation: 125

Merging 2 dataframes on Pandas

Sorry, I have a very simple question. I have two dataframes that look like this. Dataframe 1 has columns: a b c d e f g h

Dataframe 2 has columns: e ef

I'm trying to join Dataframe 2 onto Dataframe 1 on column e, which should yield columns: a b c d e ef g h or columns: a b c d e f g h ef

However, df1.merge(df2, how = 'inner', on = 'e') yields an empty dataframe when I print it out.

An 'outer' merge only extends the dataframe vertically (like using an append function).

I would appreciate some help. Thank you!

Upvotes: 1

Views: 107

Answers (3)

jezrael

Reputation: 862406

The join columns need to have the same dtype in both dataframes, so convert one of them first:

# convert the string column to int
df1['e'] = df1['e'].astype(int)
# inner is the default value of how, so it can be omitted
df1.merge(df2, on = 'e')

Sample:

df1 = pd.DataFrame({'a':list('abcdef'),
                   'b':[4,5,4,5,5,4],
                   'c':[7,8,9,4,2,3],
                   'd':[1,3,5,7,1,0],
                   'e':['5','3','6','9','2','4'],
                   'f':list('aaabbb'),
                   'g':[1,3,5,7,1,0]})

print (df1)
   a  b  c  d  e  f  g
0  a  4  7  1  5  a  1
1  b  5  8  3  3  a  3
2  c  4  9  5  6  a  5
3  d  5  4  7  9  b  7
4  e  5  2  1  2  b  1
5  f  4  3  0  4  b  0

df2 = pd.DataFrame({'ef':[10,30,50,70,10,100],
                   'e':[5,3,6,9,0,7]})
print (df2)
   e   ef
0  5   10
1  3   30
2  6   50
3  9   70
4  0   10
5  7  100

df1['e'] = df1['e'].astype(int)
df = df1.merge(df2, on = 'e') 
print (df)
   a  b  c  d  e  f  g  ef
0  a  4  7  1  5  a  1  10
1  b  5  8  3  3  a  3  30
2  c  4  9  5  6  a  5  50
3  d  5  4  7  9  b  7  70

Upvotes: 1

Tilak Putta

Reputation: 778

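You can do it like this (note that this copies the columns side by side by row position; it does not match rows on column e):

def mergeDfs(df1,df2):
    # Both dataframes need the same number of rows,
    # otherwise pd.DataFrame raises a ValueError.
    newDf = dict()
    dfList = []
    for i in df1:
        l = len(df1[i])  # number of rows, not the length of the column name
        row = []
        for j in range(l):
            row.append(df1[i].iloc[j])
        newDf[i] = row
        dfList.append(i)
    for i in df2:
        l = len(df2[i])
        row = []
        # only add columns that df1 does not already have
        if i not in dfList:
            for j in range(l):
                row.append(df2[i].iloc[j])
            newDf[i] = row
    df = pd.DataFrame(newDf)
    return df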
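
The function above builds the result column by column. If placing the columns side by side by row position is really what is wanted, a shorter sketch with pd.concat might look like the following. The sample frames and the names new_cols and side_by_side are made up for illustration, and this still does not match rows on e:

```python
import pandas as pd

# Hypothetical sample frames with the same number of rows.
df1 = pd.DataFrame({"a": ["x", "y", "z"], "e": [5, 3, 6]})
df2 = pd.DataFrame({"e": [5, 3, 0], "ef": [10, 30, 50]})

# Keep only the columns of df2 that df1 does not already have.
new_cols = [c for c in df2.columns if c not in df1.columns]

# reset_index(drop=True) makes concat line rows up by position
# rather than by whatever index labels the frames carry.
side_by_side = pd.concat(
    [df1.reset_index(drop=True), df2[new_cols].reset_index(drop=True)],
    axis=1,
)
print(side_by_side)
```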

Upvotes: 0

Adam Schroeder

Reputation: 768

Instead of

df1.merge(...)

try:

pd.merge(left=df1, right=df2, on ='e', how='inner')
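
If the result is still empty, it may help to check which keys fail to match. A minimal sketch, assuming hypothetical frames whose e values never compare equal: an outer merge with indicator=True adds a _merge column that marks each row as left_only, right_only or both.

```python
import pandas as pd

# Hypothetical frames whose key values have nothing in common.
left = pd.DataFrame({"a": ["x", "y"], "e": [5, 3]})
right = pd.DataFrame({"e": [50, 30], "ef": [10, 30]})

# No key matches, so the inner merge has no rows.
inner = pd.merge(left=left, right=right, on="e", how="inner")
print(inner.empty)

# The outer merge keeps every row from both sides; indicator=True
# adds a _merge column saying where each row came from.
outer = pd.merge(left=left, right=right, on="e", how="outer", indicator=True)
print(outer[["e", "_merge"]])
```

When no row is marked both, the outer result is just the two frames stacked, which looks like an append.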

Upvotes: 0
