Reputation: 373
How can I query a table using isin() with another dataframe? For example, there is this dataframe, df1:
| id | rank |
|---------|------|
| SE34SER | 1 |
| SEF3445 | 2 |
| 5W4G4F | 3 |
I want to query a table where a column in the table isin(df1.id). I tried doing so like this:
t = (
spark.table('mytable')
.where(sf.col('id').isin(df1.id))
.select('*')
).show()
However it errors:
AttributeError: 'NoneType' object has no attribute 'id'
Upvotes: 3
Views: 3679
Reputation: 5487
Unfortunately, you can't pass another dataframe's column to the isin() method. You can collect all the values of that column into a list and pass the list to isin(), but this is not the better approach, because it pulls every value back to the driver.
Instead, you can do an inner join between the two dataframes.
df2 = spark.table('mytable')
# Joining on the column name keeps a single id column in the result
df2.join(df1.select('id'), on='id', how='inner')
Upvotes: 5