EA00
EA00

Reputation: 633

Matching values in dataframes and concat results

I have two data frames and I want to match the entries in column1 of df1 with the entries in column 1 and 3 of df2. If all three entries match, print lines from both dataframes in the output file. For example:

df1:
    a21 1  2  3
    a32 4  5  6
    a43 7  8  9
df2:
    b21 10 21 
    b21 10 22
    b43 10 43

output:
    21  1  2  3  10
    43  7  8  9  10

I'm currently doing the following and using str.extract to get the integers:

import pandas as pd
import re

df1 = pd.read_table('data1.txt', delim_whitespace= True)
df1.columns = ['1','2','3','4']
num = df1['1'].str.extract('(\d+)').astype(int)

df2 = pd.read_table('data2.txt', delim_whitespace= True)
df2.columns = ['1','2','3']
num2 = df2['1'].str.extract('(\d+)').astype(int)

both = [df1, df2]
if num == num2:
   if num == df2['3']:
      result = pd.concat(both, axis=1)
      print(result)

I'm not sure how to match the three values and concatenate the dataframes. Should I create dictionaries instead? Can someone point me in the right direction?

Upvotes: 0

Views: 54

Answers (1)

jpp
jpp

Reputation: 164663

This is one way.

import pandas as pd

df1 = pd.DataFrame([['a21', 1, 2, 3],
                    ['a32', 4, 5, 6],
                    ['a43', 7, 8, 9]],
                   columns=['1', '2', '3', '4'])

df2 = pd.DataFrame([['b21', 10, 21],
                    ['b21', 10, 22],
                    ['b43', 10, 43]],
                   columns=['1', '2', '3'])

df1.index = df1['1'].apply(lambda x: int(x[1:]))
df2.index = df2['1'].apply(lambda x: int(x[1:]))

# filter df2 for where index = column 3
df3 = df2[df2.index == df2['3']]

# join onto df1
df4 = df1.merge(df3, left_index=True, right_index=True).drop(['1_x', '1_y'], 1)

#     2_x  3_x  4  2_y  3_y
# 1                        
# 21    1    2  3   10   21
# 43    7    8  9   10   43

Upvotes: 1

Related Questions