Reputation: 914
I am new to spark, and try to write some example code base on spark and spark streaming.
So far, I have implemented sorting function in spark, here is the code:
def sort(listSize: Int, slice: Int): Unit = {
val conf = new SparkConf().setAppName(getClass.getName)
val spark = new SparkContext(conf)
val data = genRandom(listSize)
val distData = spark.parallelize(data, slice)
val result = distData.sortBy(x => x, true)
val finalResult = result.collect()
val step5 = System.currentTimeMillis()
printlnArray(finalResult, 0, 10)
spark.stop()
}
/**
* generate random number
* @return
*/
def genRandom(listSize: Int): List[Int] = {
val range = 100000
var listBuffer = new ListBuffer[Int]
val random = new Random()
for (i <- 1 to listSize) listBuffer += random.nextInt(range)
listBuffer.toList
}
def printlnArray(list: Array[Int], start: Int, offset: Int) {
for (i <- start until start + offset) println(">>>>>>>>> list : " + i + " | " + list(i))
}
I have a trouble on implementing sort function on spark streaming. As I know, spark RDD provide sort API in spark core, but there is not such API in spark streaming, Do anyone know how to do it ? Thanks
This is a dump question, but after google on web, I does not find an right answer. If anyone know how to solve it, thanks for your help.
Upvotes: 6
Views: 5458
Reputation: 2235
You can leverage the transform function of a DStream
to transform it by using the underlying RDDs.
For instance
myDStream.transform(rdd => rdd.sortByKey())
Upvotes: 5