Yassir S

Reputation: 1042

Logstash vs Spark Streaming and Storm

I am building a distributed real-time cluster system to monitor and analyze a network. I did some research online and came up with a few technologies:

However, Logstash is not mentioned as often as Spark Streaming and Storm. I found the following architecture online, shown in the picture below:

[Image: architecture diagram]

I have two questions:

  1. I don't understand why Logstash is not often mentioned as a real-time processing system like Spark Streaming and Storm. What are the main reasons? I have been using it and it is very powerful.

  2. Regarding the Analyze part, can I use machine learning libraries in that configuration?

Upvotes: 3

Views: 1878

Answers (1)

YaRiK

Reputation: 708

  1. Logstash is not a clustered stream processing system. It is simply a JVM-based process. The latest version supports an on-disk buffer, but it does not have nearly the same delivery guarantees as Spark or Storm. Take a look at http://storm.apache.org/releases/1.0.3/Guaranteeing-message-processing.html
  2. Yes, but I'm not sure why you would store the data in Elastic first. Why not HDFS -> SparkML -> Elastic? The main thing to think about here is managing models, training, and testing.
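
To make the delivery-guarantee point in item 1 a bit more concrete, here is a rough, self-contained Python sketch of the general ack/replay idea behind at-least-once processing. It is only an illustration under my own assumptions: `ReplayingSource`, `run`, and `make_flaky_handler` are hypothetical names, not Storm's, Spark's, or Logstash's actual API, and a real system would also handle timeouts, persistence, and distribution across nodes.

```python
from collections import deque


class ReplayingSource:
    """Toy source (hypothetical) that keeps each message until it is acked."""

    def __init__(self, messages):
        # Each message gets an id so it can be tracked through processing.
        self._queue = deque(enumerate(messages))
        self._pending = {}  # emitted but not yet acknowledged

    def next(self):
        if not self._queue:
            return None
        msg_id, payload = self._queue.popleft()
        self._pending[msg_id] = payload
        return msg_id, payload

    def ack(self, msg_id):
        # Processing succeeded, so the message can be forgotten.
        self._pending.pop(msg_id, None)

    def fail(self, msg_id):
        # Processing failed, so the message is queued again for replay.
        self._queue.append((msg_id, self._pending.pop(msg_id)))


def run(source, handler):
    """Drain the source, replaying any message whose handler raised."""
    processed = []
    while True:
        item = source.next()
        if item is None:
            break
        msg_id, payload = item
        try:
            processed.append(handler(payload))
            source.ack(msg_id)
        except RuntimeError:
            source.fail(msg_id)
    return processed


def make_flaky_handler():
    """Handler that fails once on payload 'b' to simulate a worker crash."""
    seen = set()

    def handler(payload):
        if payload == "b" and payload not in seen:
            seen.add(payload)
            raise RuntimeError("simulated worker crash")
        return payload.upper()

    return handler


result = run(ReplayingSource(["a", "b", "c"]), make_flaky_handler())
print(result)
```

The failed message is not lost; it is processed again later, which is why ordering can change and why downstream steps may need to tolerate duplicates in an at-least-once setup.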

Upvotes: 2
