Reputation: 103
As you know, the transform API was integrated in Spark 3.0.0, but I have tried it and can't figure out how to use it, and I can't find any usage examples by Googling. Can anyone give me an example? Thank you!
What I have tried:
val source = spark.read.format("json").option("multiLine", "true").load("/home/user/Desktop/test.json")
source.select(transform($"array0", x => struct($"x.a".as("A"))))
org.apache.spark.sql.AnalysisException: cannot resolve '`x.a`' given input columns: [array0];;
'Project [transform(array0#0, lambdafunction(named_struct(NamePlaceholder, 'x.a), lambda x#4, false)) AS transform(array0, lambdafunction(named_struct(NamePlaceholder(), x.a AS `A`), x))#3]
+- RelationV2[array0#0] json file:/home/usr/Desktop/test.json
My source JSON:
{
  "array0": [
    {
      "a": "0",
      "b": "1"
    }
  ]
}
Upvotes: 0
Views: 68
Reputation: 3344
If you mean the higher-order function transform
used with arrays, here is a simple working example:
import org.apache.spark.sql.functions._  // needed for array, lit and transform
val df = spark.range(2).withColumn("arr", array(lit(1), lit(2)))
df.withColumn("x", transform($"arr", x => x + 1)).show()
+---+------+------+
| id| arr| x|
+---+------+------+
| 0|[1, 2]|[2, 3]|
| 1|[1, 2]|[2, 3]|
+---+------+------+
In your example, since you have structs inside the array, you access the fields of each struct through the lambda variable, for example:
source.withColumn("x", transform($"array0", x => x.getItem("a")))
Upvotes: 1