Reputation: 2591
I have the following schema in Spark and would like to flatten it.
root
|-- binlog_read_timestamp: string (nullable = true)
|-- row: struct (nullable = true)
| |-- after_values: struct (nullable = true)
| | |-- id: long (nullable = true)
| |-- before_values: struct (nullable = true)
| | |-- id: long (nullable = true)
| |-- values: struct (nullable = true)
| | |-- id: long (nullable = true)
|-- schema: string (nullable = true)
|-- table: string (nullable = true)
|-- type: string (nullable = true)
So, depending on the value of type, I want to do the following:
IF type == A THEN add a new column with after_values.id
IF type == B THEN add a new column with before_values.id
IF type == C THEN add a new column with values.id
Any suggestions on how to do it? Thanks!
Upvotes: 0
Views: 1275
Reputation: 36
Try
from pyspark.sql.functions import col, when

# These structs are nested inside "row", so use the full path; rows whose
# type matches none of the branches get null in new_column.
df = df.withColumn(
    "new_column",
    when(col("type") == "A", col("row.after_values.id"))
    .when(col("type") == "B", col("row.before_values.id"))
    .when(col("type") == "C", col("row.values.id")),
)
Upvotes: 2