Reputation: 137
I have a Spark DataFrame with columns 'id' and 'articles', and a list of values 'a_list', as below.
df = spark.createDataFrame([(1, 4), (2, 3), (5, 6)], ("id", "articles"))
a_list = [1, 4, 6]
I am trying to compare the list values with the values in the DataFrame column "articles" and, if a match is found, set column 'E' to 1, otherwise 0.
I am using "isin" in my code below:
df['E'] = df.articles.isin(a_list).astype(int)
This raises:
TypeError: unexpected type: <type 'type'>
What am I missing here?
Upvotes: 3
Views: 3386
Reputation: 214957
Provide the type as the string "int" instead of int, which is Python's native type and is not recognized by Spark. Also, to create a column in a Spark DataFrame, use the withColumn method instead of direct assignment:
df.withColumn('E', df.articles.isin(a_list).astype('int')).show()
+---+--------+---+
| id|articles|  E|
+---+--------+---+
|  1|       4|  1|
|  2|       3|  0|
|  5|       6|  1|
+---+--------+---+
Upvotes: 2