Reputation: 1634
I wrote a Spark Streaming program in Scala. In my program, I pass the remote host and remote port to socketTextStream.
On the remote machine, I have a Perl script that calls this system command:
echo 'data_str' | nc <remote_host> <9999>
This way my Spark program is able to get the data, but it seems a little confusing, because I have multiple remote machines that need to send data to the Spark machine. I want to know the right way to do this. In fact, how should I deal with data coming from multiple hosts?
For reference, my current program:
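import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}

object HBaseStream { // enclosing object was not shown; name assumed from the app name
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("HBaseStream")
    val sc = new SparkContext(conf)
    // 2-second batch interval
    val ssc = new StreamingContext(sc, Seconds(2))
    // socketTextStream connects to a TCP server at the given host and port
    val inputStream = ssc.socketTextStream(<remote-host>, 9999)
    -------------------
    -------------------
    ssc.start()
    // Wait for the computation to terminate
    ssc.awaitTermination()
  }
}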
Thanks in advance.
Upvotes: 1
Views: 384
Reputation: 3228
You can find more information in the "Level of Parallelism in Data Receiving" section of the Spark Streaming Programming Guide.
Summary:
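As a rough sketch of what that section suggests: create one input DStream per source and merge them with `union`, so the rest of the job works on a single stream. The host names and the object name below are hypothetical, and the sketch assumes each remote machine runs a TCP server on port 9999 that Spark can connect to.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

object MultiHostStream { // hypothetical object name
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("HBaseStream")
    val ssc = new StreamingContext(conf, Seconds(2))

    // Hypothetical host names; replace with the real remote machines.
    val hosts = Seq("host-a", "host-b", "host-c")

    // One input DStream (and therefore one receiver) per remote host.
    val streams = hosts.map(host => ssc.socketTextStream(host, 9999))

    // Merge the per-host streams into a single DStream of lines.
    val unified = ssc.union(streams)
    unified.print()

    ssc.start()
    // Wait for the computation to terminate
    ssc.awaitTermination()
  }
}
```

Each receiver occupies one core, so the application needs more cores than it has receivers, otherwise nothing is left to process the received data.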
Upvotes: 1