Osca
Osca

Reputation: 1694

Merge dataframes base on Column and Row values

I have a dataframe that I'd like to merge with another dataframe with the same column values. Also with specified row values.

Dataframe 1

d = {'id': ['111', '222', '333'], 'queries': ['High', 'Mid', 'Low'], 'time_stay': ['High', 'Mid', 'Low']}
dd = pd.DataFrame(data=d)

Dataframe 2

l = {'Features': ['queries', 'queries', 'queries', 'time_stay', 'time_stay', 'time_stay'], 'groups':['High', 'Mid', 'Low', 'High', 'Mid', 'Low'], 'parameters':[1.2, 1.1, 1.0, 1000, 2000, 3000]}
feature_data = pd.DataFrame(data=l)
feature_data

I pivoted dataframe 2 to make the first row as columns.

feature_data = feature_data.T
feature_data.columns = feature_data.loc['Features', :]

Then I merged it dd.merge(feature_data, on=list(feature_data.columns), how='left') As expected, pandas doesn't let me merge it because column queries is duplicated.

Expected output

enter image description here

What's a better way to do this ? thanks

Upvotes: 0

Views: 84

Answers (1)

Priyank
Priyank

Reputation: 144

Filter for column values in feature_data dataframe and then merge it to dd dataframe

cols_name = 'queries'
queries = feature_data[feature_data['Features']==cols_name]
dd.merge(queries[['groups','parameters']],
         left_on=['queries'],
         right_on=['groups'],
         how="left")
        .drop(columns='groups')
print(dd)

id  queries time_stay   parameters
111 High    High        1.2
222 Mid     Mid         1.1
333 Low     Low         1.0

Upvotes: 1

Related Questions