PradeepKumbhar

Reputation: 3421

How is Apache Apex different from Apache Storm?

At a glance, Apache Apex looks similar to Apache Storm, and I'm not quite getting the difference. Can someone please explain what the key differences are? In other words, when should I use one instead of the other?

Upvotes: 9

Views: 1337

Answers (2)

brusli

Reputation: 79

Architecture and Features

+-------------------+---------------------------+---------------------+
|                   |           Storm           |         Apex        |
+-------------------+---------------------------+---------------------+
| Model             | Native Streaming          | Native Streaming    |
|                   | Micro batch (Trident)     |                     |
+-------------------+---------------------------+---------------------+
| Language          | Java                      | Java (Scala)        |
|                   | Supports non-JVM          |                     |
|                   | languages                 |                     |
+-------------------+---------------------------+---------------------+
| API               | Compositional             | Compositional (DAG) |
|                   | Declarative (Trident)     | Declarative         |
|                   | Limited SQL               |                     |
|                   | support (Trident)         |                     |
+-------------------+---------------------------+---------------------+
| Locality          | Data Locality             | Advanced Processing |
+-------------------+---------------------------+---------------------+
| Latency           | Low                       | Very Low            |
|                   | High (Trident)            |                     |
+-------------------+---------------------------+---------------------+
| Throughput        | Limited in Ack mode       | Very high           |
+-------------------+---------------------------+---------------------+
| Scalability       | Limited due to Ack        | Horizontal          |
+-------------------+---------------------------+---------------------+
| Partitioning      | Standard                  | Advanced            |
|                   | Parallelism at worker,    | Parallel pipes,     |
|                   | executor and task level   | unifiers            |
+-------------------+---------------------------+---------------------+
| Connector Library | Limited (certification)   | Rich library of     |
|                   |                           | connectors in       |
|                   |                           | Apex Malhar         |
+-------------------+---------------------------+---------------------+

Operability

+------------+--------------------------+---------------------+
|            |           Storm          |         Apex        |
+------------+--------------------------+---------------------+
| State      | External store           | Checkpointing       |
| Management | Limited checkpointing    | Local checkpointing |
|            | Difficult to exploit     |                     |
|            | local state              |                     |
+------------+--------------------------+---------------------+
| Recovery   | Cumbersome API to        | Incremental         |
|            | store and retrieve state | (buffer server)     |
|            | Requires user code       |                     |
+------------+--------------------------+---------------------+
| Processing | At least once            | At least once       |
| Semantics  | Exactly once requires    | End to end          |
|            | user code and affects    | exactly once        |
|            | latency                  |                     |
+------------+--------------------------+---------------------+
| Back       | Watermark on queue       | Automatic           |
| Pressure   | size for spout and bolt  | Buffer server       |
|            | Does not scale           | memory and disk     |
+------------+--------------------------+---------------------+
| Elasticity | Through CLI only         | Yes w/ full user    |
|            |                          | control             |
+------------+--------------------------+---------------------+
| Dynamic    | No                       | Yes                 |
| topology   |                          |                     |
+------------+--------------------------+---------------------+
| Security   | Kerberos                 | Kerberos, RBAC,     |
|            |                          | LDAP                |
+------------+--------------------------+---------------------+
| Multi      | Mesos, RAS - memory,     | YARN                |
| Tenancy    | CPU, YARN                | full isolation      |
+------------+--------------------------+---------------------+
| DevOps     | REST API                 | REST API            |
| Tools      | Basic UI                 | DataTorrent RTS     |
+------------+--------------------------+---------------------+

Source: Webinar: Apache Apex (Next Gen Hadoop) vs. Storm - Comparison and Migration Outline https://www.youtube.com/watch?v=sPjyo2HfD_I

Upvotes: 3

ashwin111

Reputation: 146

There are fundamental differences in architecture that make the two platforms very different in terms of latency, scaling, and state management.

At the very basic level,

  1. Apache Storm uses record acknowledgement to guarantee message delivery.
  2. Apache Apex uses checkpointing to guarantee message delivery.
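To make the two points above a bit more concrete, here is a toy Python sketch of the two ideas. It does not use the Storm or Apex APIs; the class names `AckingSource` and `CheckpointedCounter` are made up for illustration, and the real systems are far more involved. The first class tracks every record until it is acknowledged and replays whatever is still pending. The second snapshots operator state at window boundaries and, after a failure, rolls back to the last snapshot and replays from the next window.

```python
from dataclasses import dataclass, field


@dataclass
class AckingSource:
    """Record-acknowledgement idea (hypothetical toy, not Storm's API)."""
    pending: dict = field(default_factory=dict)

    def emit(self, msg_id, record):
        # Remember every emitted record until someone acks it.
        self.pending[msg_id] = record

    def ack(self, msg_id):
        self.pending.pop(msg_id, None)

    def replay(self):
        # Anything never acked has to be sent again.
        return list(self.pending.items())


@dataclass
class CheckpointedCounter:
    """Checkpointing idea (hypothetical toy, not Apex's API)."""
    count: int = 0
    saved_count: int = 0
    saved_window: int = -1

    def process(self, record):
        self.count += 1

    def checkpoint(self, window_id):
        # Snapshot state at a window boundary; no per-record bookkeeping.
        self.saved_count = self.count
        self.saved_window = window_id

    def recover(self):
        # Roll back to the last snapshot and report which window to replay.
        self.count = self.saved_count
        return self.saved_window + 1


# Acknowledgement: record 1 is never acked, so it is replayed.
source = AckingSource()
for i, rec in enumerate(["a", "b", "c"]):
    source.emit(i, rec)
source.ack(0)
source.ack(2)
to_replay = source.replay()

# Checkpointing: fail in the middle of window 2, then recover.
windows = [["a", "b"], ["c"], ["d", "e"]]
counter = CheckpointedCounter()
for wid in (0, 1):
    for rec in windows[wid]:
        counter.process(rec)
    counter.checkpoint(wid)
counter.process("d")             # partial work, lost in the failure
resume_from = counter.recover()
for rec in windows[resume_from]:
    counter.process(rec)
```

In this sketch the acknowledgement approach pays a bookkeeping cost per record, while the checkpointing approach pays once per window but has to redo the partially processed window after a failure.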

You can read about more differences in the following blog post, which also covers the other major stream processing platforms out there.

https://databaseline.wordpress.com/2016/03/12/an-overview-of-apache-streaming-technologies/

Upvotes: 3
