Reputation: 619
I have data as an Array[Byte] that I want to convert into a Spark RDD or DataFrame, so that I can write it directly to a Google bucket as a file. I am not able to write Array[Byte] data to a Google bucket directly, hence the need for this conversion.
The code below writes the data to the local FS, but not to a Google bucket:
import java.io.FileOutputStream
import org.bouncycastle.openpgp.PGPPublicKey

// Encrypt the payload, then stream the resulting bytes to a local file
val encrypted = encrypt(original, readPublicKey(pubKey), outFile, true, true)
val fos = new FileOutputStream(outFile)
fos.write(encrypted)
fos.close()

def encrypt(clearData: Array[Byte], encKey: PGPPublicKey, fileName: String, withIntegrityCheck: Boolean, armor: Boolean): Array[Byte] = {
  ...
}
So how can I convert Array[Byte] data into an RDD or a DataFrame? I am using Scala.
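For context, plain java.io streams only resolve local paths, which is why the write above cannot target a gs:// URI. Below is a minimal sketch of writing the bytes directly through Hadoop's FileSystem API instead, assuming the GCS connector is on the classpath; the bucket and object names are placeholders:

import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// With the GCS connector configured, gs:// URIs resolve to Google Cloud Storage
val fs = FileSystem.get(new URI("gs://my-bucket"), new Configuration())

// Stream the encrypted bytes into a single object, byte-for-byte
val out = fs.create(new Path("gs://my-bucket/out/encrypted.pgp"))
try out.write(encrypted) finally out.close()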
Upvotes: 1
Views: 2039
Reputation: 2451
Just use .toDF(), or .toDF().rdd if you need an RDD. (.toDF() comes from spark.implicits._, which spark-shell imports for you.)
scala> val arr: Array[Byte] = Array(192.toByte, 168.toByte, 1.toByte, 4.toByte)
arr: Array[Byte] = Array(-64, -88, 1, 4)
scala> val df = arr.toSeq.toDF()
df: org.apache.spark.sql.DataFrame = [value: tinyint]
scala> df.show()
+-----+
|value|
+-----+
| -64|
| -88|
| 1|
| 4|
+-----+
scala> df.printSchema()
root
|-- value: byte (nullable = false)
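Outside spark-shell, the implicits need an explicit import. A minimal end-to-end sketch of the same conversion followed by a bucket write (the gs:// bucket name is a placeholder, and the GCS connector is assumed to be configured):

import org.apache.spark.sql.SparkSession

object BytesToBucket {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("bytes-to-bucket").getOrCreate()
    import spark.implicits._ // brings .toDF() into scope

    val encrypted: Array[Byte] = Array(192.toByte, 168.toByte, 1.toByte, 4.toByte)

    // Same conversion as in the transcript above: one tinyint column
    val df = encrypted.toSeq.toDF("value")

    // Each byte becomes one text row in the output
    df.write.mode("overwrite").csv("gs://my-bucket/encrypted-bytes")

    spark.stop()
  }
}

Note that this stores the bytes as text rows, not as the original binary payload; if the goal is a single byte-for-byte file in the bucket, the FileSystem sketch in the question is the more direct route.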
Upvotes: 2