CLF

Reputation: 155

Lamina vs Storm

I am designing a prototype real-time monitor for processing fairly large amounts (>30 GB/day) of streaming numeric data. I would like to write this in Clojure, as the language seems well suited to the kind of "Observer + state machine" system that this will probably end up as.

The two main candidate frameworks I have found are Lamina and Storm. There are also Riemann and Pulse, but Riemann seems to be more of a full solution than a framework, and I'd rather not commit to a final design yet; Pulse's repo looks a little unmaintained.

What I would like to know is: what kinds of data flow and workflow are these two projects optimised for? Storm seems to be more mature, but Lamina seems more composable and "Clojureic" (my background is Python, so I tend to rate this highly).

What I've found from reading online:

Upvotes: 5

Views: 808

Answers (3)

amalloy

Reputation: 92117

Storm probably isn't a bad choice, but "over 30GB per day" of numeric data isn't big data; it is tiny data. Any semi-modern computer can handle that much data easily on one node with Lamina. You might want to go with Storm anyway, so that you can scale easily once you do need more servers, but I imagine there's some initial friction in getting Storm set up (and some ongoing friction in maintaining the cluster), and that effort will be wasted if you never have to scale up.
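For a rough sense of scale, here is a back-of-the-envelope calculation of the sustained rate that 30 GB/day implies. It is only a sketch: decimal gigabytes, an evenly spread load, and 8-byte (64-bit) samples are all assumptions, not details from the question.

```python
# Back-of-the-envelope: what sustained rate does 30 GB/day imply?
# Assumptions (not stated in the question): decimal gigabytes, load spread
# evenly over the day, and 8-byte (64-bit) numeric samples.
GB = 10**9
SECONDS_PER_DAY = 24 * 60 * 60

bytes_per_day = 30 * GB
bytes_per_second = bytes_per_day / SECONDS_PER_DAY
samples_per_second = bytes_per_second / 8  # assumed sample size

print(f"{bytes_per_second / 1e3:.0f} kB/s")        # 347 kB/s
print(f"{samples_per_second:.0f} samples/s")       # 43403 samples/s
```

Real traffic is usually bursty, so peaks may be several times higher than this average, but the order of magnitude is what matters when deciding between one node and a cluster.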

Upvotes: 8

Gordon Seidoh Worley

Reputation: 8078

Lamina seems like an okay choice, but it appears to lack Storm's killer feature entirely: cluster computing management. Storm will take care of most of the dirty work of distributing your computation across a cluster of nodes, leaving you to focus on your business logic, so long as you fit it within the Storm framework. Lamina, from what I can see, provides a nice way to organize your computation, but you'll have to take care of all the details of scaling it out yourself if that's something you need.

Upvotes: 1

Arthur Ulfeldt

Reputation: 91587

Storm incorporates cluster management and handling of failed nodes into the flow because it was designed to be sort of "like Hadoop but for streaming". From what I understand of your requirements, that seems closer to your use case.

Upvotes: 4
