Reputation: 41
I searched but couldn't find a concrete difference between the Apache distribution of Spark 2 and the Cloudera distribution of Spark 2. Can anybody help me understand how they differ in Spark Core, Spark SQL, and Spark Streaming?
Upvotes: 1
Views: 980
Reputation: 5957
They refer to the same thing. Cloudera distributes a packaged version of Hadoop (CDH) that includes Apache Spark 2. There are slight differences between this Apache Spark 2 and the latest upstream version of Spark 2 from https://spark.apache.org/. These are usually spelled out in the Release Notes for CDH Spark 2.
For example, the release notes have a section called "Spark 2 Known Issues", which describes some missing features.
In general, incompatibilities arise for two reasons: CDH releases lag behind upstream releases, and CDH has to maintain major-version compatibility between its minor releases.
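If you want to see which build you are actually running, one rough way is to look at the version string Spark reports. The sketch below assumes that a vendor build appends a suffix to the upstream version (something like `2.4.0.cloudera2` or `2.4.0-cdh6.3.2`); those example strings and the helper name `split_spark_version` are my own illustration, not something taken from the release notes, so check them against what your cluster prints.

```python
import re

# Assumption: a vendor build reports "<upstream version><separator><suffix>",
# e.g. "2.4.0.cloudera2" or "2.4.0-cdh6.3.2". A plain upstream build
# reports only "<major>.<minor>.<patch>".
_VERSION_RE = re.compile(r"^(\d+\.\d+\.\d+)(?:[.-](.+))?$")


def split_spark_version(version):
    """Split a Spark version string into (upstream_version, vendor_suffix).

    vendor_suffix is None when the string carries no extra suffix.
    Hypothetical helper for illustration; not part of Spark or CDH.
    """
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return match.group(1), match.group(2)


# In a PySpark shell you would pass the session's own version instead:
#   upstream, vendor = split_spark_version(spark.version)
upstream, vendor = split_spark_version("2.4.0.cloudera2")
```

The upstream part tells you which Apache release notes to read, and a non-empty suffix is a hint that the vendor's release notes and known issues apply on top of them.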
Upvotes: 2