Anderson Dutra

Reputation: 101

Add new column with literal value to a struct column in Dataframe in Spark Scala

I have a dataframe with the following schema:

root
 |-- docnumber: string (nullable = true)
 |-- event: struct (nullable = false)
 |    |-- data: struct (nullable = true)
 |    |    |-- codevent: int (nullable = true)

I need to add a column inside event.data so that the schema looks like this:

root
 |-- docnumber: string (nullable = true)
 |-- event: struct (nullable = false)
 |    |-- data: struct (nullable = true)
 |    |    |-- codevent: int (nullable = true)
 |    |    |-- needtoaddit: int (nullable = true)

I tried

How can I make it work?

Upvotes: 2

Views: 1425

Answers (2)

ZygD

Reputation: 24356

Spark 3.1+

To add fields inside struct columns, use withField. Call it on the top-level struct and give the nested field as a dotted path:

col("event").withField("data.needtoaddit", lit("added"))

Input:

import org.apache.spark.sql.functions.{col, lit, struct}

val df = spark.createDataFrame(Seq(("1", 2)))
    .select(
        col("_1").as("docnumber"),
        struct(struct(col("_2").as("codevent")).as("data")).as("event")
    )
df.printSchema()
// root
//  |-- docnumber: string (nullable = true)
//  |-- event: struct (nullable = false)
//  |    |-- data: struct (nullable = false)
//  |    |    |-- codevent: integer (nullable = false)

Script:

val df2 = df.withColumn(
    "event",
    // withField is called on the top-level struct with a dotted path,
    // so event keeps its data level instead of being replaced by it.
    col("event").withField("data.needtoaddit", lit("added"))
)

df2.printSchema()
// root
//  |-- docnumber: string (nullable = true)
//  |-- event: struct (nullable = false)
//  |    |-- data: struct (nullable = false)
//  |    |    |-- codevent: integer (nullable = false)
//  |    |    |-- needtoaddit: string (nullable = false)
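The question's target schema lists needtoaddit as an int, while lit("added") produces a string field. A minimal sketch of the same withField call with an integer literal follows. It assumes the df built above, and the value 0 is a placeholder that the question does not specify. The SparkSession setup is only there to make the snippet self-contained; in spark-shell, spark already exists.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lit, struct}

// Local session for illustration only (assumption: not running in spark-shell).
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("withFieldSketch")
  .getOrCreate()

// Same input as in the answer above.
val df = spark.createDataFrame(Seq(("1", 2)))
  .select(
    col("_1").as("docnumber"),
    struct(struct(col("_2").as("codevent")).as("data")).as("event")
  )

// 0 is a placeholder value; an Int literal yields an integer field.
val dfInt = df.withColumn(
  "event",
  col("event").withField("data.needtoaddit", lit(0))
)

dfInt.printSchema()
```

If the field should start out empty instead, lit(null).cast("int") in place of lit(0) is one way to get a nullable int field.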

Upvotes: 3

mck

Reputation: 42342

You're kind of close. Try this code:

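import org.apache.spark.sql.functions.{lit, struct}
import spark.implicits._  // needed for the $"..." column syntax

val df2 = df.withColumn(
    "event",
    struct(
        struct(
            $"event.data.*",                // keep the existing fields of data
            lit("added").as("needtoaddit")  // append the new field
        ).as("data")
    )
)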
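One thing to watch with this approach: the outer struct(...) rebuilds event from scratch, so any other fields of event would have to be listed again or they are dropped. The sketch below assumes a hypothetical sibling field event.source, which is not in the question's schema and is added purely for illustration. It uses col(...) rather than $"..." so that it does not depend on spark.implicits.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, lit, struct}

// Local session for illustration only (assumption: not running in spark-shell).
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("structRebuildSketch")
  .getOrCreate()

// `source` is a hypothetical sibling of `data`, not part of the question.
val df = spark.createDataFrame(Seq(("1", 2, "web")))
  .select(
    col("_1").as("docnumber"),
    struct(
      struct(col("_2").as("codevent")).as("data"),
      col("_3").as("source")
    ).as("event")
  )

val df2 = df.withColumn(
  "event",
  struct(
    // Rebuild data: existing fields plus the new one.
    struct(col("event.data.*"), lit("added").as("needtoaddit")).as("data"),
    // Each sibling of data has to be re-listed to survive the rebuild.
    col("event.source").as("source")
  )
)

df2.printSchema()
```

With many sibling fields this gets verbose, which is where withField on Spark 3.1+ is more convenient.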

Upvotes: 1
