Leokins

Reputation: 89

PySpark Set Column value equal to another dataframe value if rows match

Hi, I want to set a column value in a Spark dataframe by looking up each row's name in another dataframe and taking the value from the matching row.

Example:

df1:
average name
3.5      n1
1.2      n2
4.2      n3

df2:
name    
n1     
n1        
n1    
n2
n3
n1
n2
n3
n3

df_i_want:
average name
3.5      n1
3.5      n1
3.5      n1
1.2      n2
4.2      n3
3.5      n1
1.2      n2
4.2      n3
4.2      n3

Upvotes: 0

Views: 1529

Answers (2)

User12345

Reputation: 5480

All you need is a join.

Join your dataframe df2 with df1 on name, then select the columns in the order you want:

df3 = df2.join(df1, on = 'name').select('average', 'name')

The above snippet should give you the desired result.

Upvotes: 2

YOLO

Reputation: 21719

You need a join to do this task:

# join both dataframes on name
df3 = df2.join(df1, on='name', how='left')

# change column sequence
df3 = df3.select('average','name')

# order by name values
df3 = df3.orderBy('name', ascending=True)

Upvotes: 2
