Reputation: 3184
I have been bogged down by this for some hours now. I tried collect and mkString and I am still not able to print to the console or save as a text file.
scala> val au1 = sc.parallelize(List(("a",Array(1,2)),("b",Array(1,2))))
scala> val au2 = sc.parallelize(List(("a",Array(3)),("b",Array(2))))
scala> val au3 = au1.union(au2)
Result of the union is
Array[(String,Array[Int])] = Array((a,Array(1,2)),(b,Array(1,2)),(a,Array(3)),(b,Array(2)))
Every print attempt results in the following when I do x(0) and x(1):
Array[Int]) does not take parameters
In my last attempt I ran the following, and it results in an error:
scala> val au4 = au3.map(x => (x._1, x._2._1._1, x._2._1._2))
<console>:33: error: value _1 is not a member of Array[Int]
val au4 = au3.map(x => (x._1, x._2._1._1, x._2._1._2))
Upvotes: 1
Views: 1286
Reputation: 23119
The result of au3 is not Array[(String,Array[Int])], it is RDD[(String,Array[Int])], so this is how you could write the output to a file:
au3.map( r => (r._1, r._2.map(_.toString).mkString(":")))
.saveAsTextFile("data/result")
You need to map over the array and build a string from it so that it can be written to the file as
(a,1:2)
(b,1:2)
(a,3)
(b,2)
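As a small illustration of why the array has to be turned into a string first, here is a sketch in plain Scala with no Spark involved. The object and method names are made up for this example. An Array inherits the JVM's default toString, so writing it directly gives an identity string instead of the elements, while mkString joins the elements themselves.

```scala
object ArrayToText {
  // Default toString of an Array[Int]: a JVM identity string starting with "[I@",
  // not the elements.
  def rawText(values: Array[Int]): String = values.toString

  // mkString joins the elements with the given separator.
  def joined(values: Array[Int], sep: String): String = values.mkString(sep)
}
```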
You could write to the file without brackets as below (Row comes from org.apache.spark.sql):
import org.apache.spark.sql.Row
au3.map( r => Row(r._1, r._2.map(_.toString).mkString(":")).mkString(","))
.saveAsTextFile("data/result")
Output:
a,1:2
b,1:2
a,3
b,2
The fields are separated by a comma "," and the array values are separated by a colon ":".
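The two output shapes can be tried without a cluster. This is only a sketch on a local List standing in for the contents of au3, and the names RowFormat, bracketed and plain are hypothetical. The bracketed form relies on the tuple's own toString, and the plain form builds the line by hand, which avoids the Row import.

```scala
object RowFormat {
  // Local stand-in for the data held in au3 (assumption: same pairs as in the question).
  val sample: List[(String, Array[Int])] =
    List(("a", Array(1, 2)), ("b", Array(1, 2)), ("a", Array(3)), ("b", Array(2)))

  // Tuple form: Tuple2.toString adds the surrounding brackets, e.g. (a,1:2).
  def bracketed(row: (String, Array[Int])): String =
    (row._1, row._2.mkString(":")).toString

  // Plain form: key, comma, then the array values joined with ":".
  def plain(row: (String, Array[Int])): String =
    s"${row._1},${row._2.mkString(":")}"
}
```

On the RDD itself the same functions should fit into a map call, for example au3.map(RowFormat.plain), before saveAsTextFile.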
Hope this helps!
Upvotes: 2
Reputation: 41987
._1 and ._2 can be used on tuples, not on arrays. ("a",Array(1,2)) is a tuple, so ._1 is a and ._2 is Array(1,2). To get the elements of an array you need to use (), as in x._2(0). But the arrays in au2 have only one element, so x._2(1) will work for au1 and not for au2. You can use Option or Try as below:
import scala.util.Try
val au4 = au3.map(x => (x._1, x._2(0), Try(x._2(1)).getOrElse(0)))
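For comparison, here is a sketch of both variants as plain functions, again without Spark. The names SafeArrayAccess, secondWithTry and secondWithOption are invented for this example, and the default of 0 simply mirrors the line above. The Option variant uses lift, which returns None when the index is out of range, so no exception is thrown at all.

```scala
import scala.util.Try

object SafeArrayAccess {
  // Try variant: the ArrayIndexOutOfBoundsException is caught and replaced by the default.
  def secondWithTry(values: Array[Int], default: Int = 0): Int =
    Try(values(1)).getOrElse(default)

  // Option variant: lift(1) is Some(element) if index 1 exists, otherwise None.
  def secondWithOption(values: Array[Int], default: Int = 0): Int =
    values.lift(1).getOrElse(default)
}
```

Either one could replace the third tuple field, for example au3.map(x => (x._1, x._2(0), SafeArrayAccess.secondWithOption(x._2))). Note that x._2(0) would still fail on an empty array.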
Upvotes: 3