Olscream

Reputation: 127

Scala Spark - Get String row values from a WrappedArray

I'm writing a Scala script for Spark and I have a "specialArray" as follows:

 specialArray = ...
 specialArray.show(6)
 __________________________ console __________________________________

 specialArray: org.apache.spark.sql.DataFrame = [_VALUE: array<string>]
 +--------------+
 |        _VALUE|
 +--------------+
 |    [fullForm]|
 |    [fullForm]|
 |    [fullForm]|
 |    [fullForm]|
 |    [fullForm]|
 |    [fullForm]|
 |    [fullForm]|
 +--------------+
 only showing top 6 rows

But I would like to see the content of those "fullForm" sub-arrays. How would you do this, please?

Thank you very much in advance!

I have already tried to get the first value this way:

val resultTest = specialArray.map(s => s.toString).toDF().collect()(0)
__________________________ console __________________________________
resultTest: org.apache.spark.sql.Row = [[WrappedArray(fullForm)]]

So I don't know how to deal with that, and I haven't found anything "effective" in the docs: https://www.scala-lang.org/api/current/scala/collection/mutable/WrappedArray.html.

If you have any ideas or any questions for me, feel free to leave a message, thanks :).

Upvotes: 2

Views: 2354

Answers (1)

koiralo

Reputation: 23109

Here specialArray is a DataFrame, so to see its schema you can use specialArray.printSchema, which shows the data type of each column.

If you just want to see the data inside the dataframe you can use

specialArray.show(6, false), where the second parameter, false, tells show not to truncate long values when displaying them.
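
For anyone who wants to try these two calls without the original data, here is a minimal sketch. The sample rows, the local SparkSession settings, and the names InspectArrayColumn and sampleFrame are all assumptions made for illustration; only the _VALUE column name and its array<string> type come from the question.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object InspectArrayColumn {
  // Hypothetical stand-in for specialArray: a single array<string> column named _VALUE.
  def sampleFrame(spark: SparkSession): DataFrame = {
    import spark.implicits._ // provides toDF on local Scala collections
    Seq(Seq("fullForm"), Seq("fullForm", "shortForm")).toDF("_VALUE")
  }

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumed settings).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("inspect-array-column")
      .getOrCreate()

    val specialArray = sampleFrame(spark)

    // Prints the column names and data types.
    specialArray.printSchema()

    // Shows up to 6 rows; false disables truncation of long cell values.
    specialArray.show(6, false)

    spark.stop()
  }
}
```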

The next thing you can do is use select or withColumn to turn the WrappedArray into a comma-separated String (any other separator works too):

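import org.apache.spark.sql.functions._
import spark.implicits._ // enables $"..."; already imported in spark-shell, assumes the session is named spark

// Join the elements of each array into a single string.
specialArray.select(concat_ws(",", $"_VALUE")).show(false)
specialArray.withColumn("_VALUE", concat_ws(",", $"_VALUE")).show(false)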
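
If the goal is to get at the strings themselves rather than a joined display value, two other options may help: reading each collected Row with getSeq, or flattening the arrays with explode. The following is only a sketch under assumptions: the sample data, the local session, and the names ArrayColumnValues, collectArrays and explodeValues are hypothetical, and the column is assumed to contain no null arrays.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}

object ArrayColumnValues {
  // Collects the array column to the driver, one Seq[String] per row.
  // Only sensible for small results: collect() pulls every row into driver memory.
  // Assumes the column holds no null arrays.
  def collectArrays(df: DataFrame, column: String): Seq[Seq[String]] =
    df.select(col(column)).collect().toSeq.map(_.getSeq[String](0))

  // Flattens the arrays so that each element becomes its own row in a column named value.
  def explodeValues(df: DataFrame, column: String): DataFrame =
    df.select(explode(col(column)).as("value"))

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumed settings).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("array-column-values")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for specialArray from the question.
    val specialArray = Seq(Seq("fullForm"), Seq("fullForm", "shortForm")).toDF("_VALUE")

    // Plain Scala strings on the driver, printed one row per line.
    collectArrays(specialArray, "_VALUE").foreach(values => println(values.mkString(",")))

    // Stays distributed: one output row per array element.
    explodeValues(specialArray, "_VALUE").show(false)

    spark.stop()
  }
}
```

getSeq is used here instead of matching on WrappedArray directly, so the code does not depend on which concrete collection class Spark returns for array columns.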

Hope this helps!

Upvotes: 1
