viethungha0610

Reputation: 13

How to explode each row that is an Array into columns in Spark (Scala)?

I have a Spark DataFrame with a single column 'value', where each row is an Array of equal length. How can I explode this single 'value' column into multiple columns that follow a schema like this?

Single-column DataFrame

val bronzeDfSchema = new StructType()
  .add("DATE", IntegerType)
  .add("NUMARTS", IntegerType)
  .add("COUNTS", StringType)
  .add("THEMES", StringType)
  .add("LOCATIONS", StringType)
  .add("PERSONS", StringType)
  .add("ORGANIZATIONS", StringType)
  .add("TONE", StringType)
  .add("CAMEOEVENTIDS", StringType)
  .add("SOURCES", StringType)
  .add("SOURCEURLS", StringType)

Thank you!

Upvotes: 1

Views: 416

Answers (1)

Rushabh Gujarathi

Reputation: 146

This should work just fine:

val schema = Seq(
  ("DATE", 0), ("NUMARTS", 1), ("COUNTS", 2), ("THEMES", 3),
  ("LOCATIONS", 4), ("PERSONS", 5), ("ORGANIZATIONS", 6), ("TONE", 7),
  ("CAMEOEVENTIDS", 8), ("SOURCES", 9), ("SOURCEURLS", 10))

val df2 = schema.foldLeft(df) { case (acc, (name, idx)) =>
  acc.withColumn(name, col("value").getItem(idx))
}

After you do this, just cast each column to the data type you want.
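Putting it together, here is a minimal sketch (assuming `df` is your single-column DataFrame with an array column named `value`). Zipping the column names with their indices avoids hand-numbering each pair, and the casts and the drop of the original array column can be chained at the end:

```scala
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.IntegerType

// Column names listed in the same order as the array elements.
val names = Seq("DATE", "NUMARTS", "COUNTS", "THEMES", "LOCATIONS",
  "PERSONS", "ORGANIZATIONS", "TONE", "CAMEOEVENTIDS", "SOURCES", "SOURCEURLS")

// One withColumn per name, pulling the element at that index.
val exploded = names.zipWithIndex.foldLeft(df) { case (acc, (name, idx)) =>
  acc.withColumn(name, col("value").getItem(idx))
}

// Cast the integer columns (per bronzeDfSchema) and drop the array column.
val result = exploded
  .withColumn("DATE", col("DATE").cast(IntegerType))
  .withColumn("NUMARTS", col("NUMARTS").cast(IntegerType))
  .drop("value")
```

The remaining columns stay as `StringType`, which matches `bronzeDfSchema` above, so only `DATE` and `NUMARTS` need an explicit cast.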

Upvotes: 2
