Reputation:
Can anyone explain why we should still use Hadoop now that Spark is available? As I understand it, Spark was created in the first place to address the limitations of Hadoop.
Thank you.
Upvotes: 1
Views: 165
Reputation: 139
Both Spark and Hadoop build on the MapReduce concept. However, Spark is usually faster because it can keep intermediate data in memory. Spark has grown to include Spark SQL, MLlib, and Streaming, whereas Hadoop relies on separate, independent projects such as Pig and Hive for comparable features. Bringing all of these components together under one framework gave Spark a major boost. On the other hand, Hadoop is less abstracted than Spark, so it offers more freedom for customisation, especially in the map and reduce phases. In Spark, these details are hidden behind higher-level abstractions.
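To make the "map and reduce phase" point concrete, here is a minimal sketch in plain Python of a word count written as explicit map, shuffle, and reduce steps. It does not use Hadoop's actual API; `mapper`, `reducer`, and `run_mapreduce` are hypothetical names chosen for illustration, and everything runs in a single process.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# Hypothetical names for illustration only; this is not Hadoop's API.

def mapper(line: str) -> Iterator[Tuple[str, int]]:
    """Map phase: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def reducer(key: str, values: Iterable[int]) -> Tuple[str, int]:
    """Reduce phase: combine all values emitted for one key."""
    return key, sum(values)

def run_mapreduce(
    records: Iterable[str],
    map_fn: Callable[[str], Iterator[Tuple[str, int]]],
    reduce_fn: Callable[[str, Iterable[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Run map, then group by key (the shuffle), then reduce, in one process."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in map_fn(record):
            groups[key].append(value)
    # Sorting by key mimics the sorted order in which a reducer sees its keys.
    return dict(reduce_fn(key, values) for key, values in sorted(groups.items()))

lines = ["spark reads hdfs", "hadoop writes hdfs"]
counts = run_mapreduce(lines, mapper, reducer)
print(counts)
```

Because each phase is a separate function, any one of them can be swapped out independently, which is roughly the kind of control the answer attributes to the lower-level MapReduce model.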
Upvotes: 0
Reputation: 7279
Hadoop has several components, including a distributed file system (HDFS), a parallel data processing framework (MapReduce), and a wide column store (HBase).
While Spark can be seen as a next-generation version of MapReduce with generalized dataflows (DAGs), Spark does not replace HDFS or HBase. Rather, it can consume data from HDFS and HBase as input, and write data back to them.
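As a rough illustration of what a generalized dataflow means, the sketch below chains several transformations and only runs them when a result is requested. It is plain Python, not Spark's API: `LazyDataset` and its methods are hypothetical, it models only a linear chain rather than a full DAG, and the input is an in-memory list standing in for data that would come from HDFS or HBase.

```python
from typing import Any, Callable, Iterable, List, Tuple

class LazyDataset:
    """Hypothetical stand-in for a dataflow: records steps, runs them on demand."""

    def __init__(self, source: Iterable[Any], steps: Tuple[Tuple[str, Callable], ...] = ()):
        self._source = source
        self._steps = steps

    def map(self, fn: Callable[[Any], Any]) -> "LazyDataset":
        # Record the step; nothing is computed yet.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, pred: Callable[[Any], bool]) -> "LazyDataset":
        # Record the step; nothing is computed yet.
        return LazyDataset(self._source, self._steps + (("filter", pred),))

    def collect(self) -> List[Any]:
        """Run every recorded step in order and return the results as a list."""
        data = iter(self._source)
        for kind, fn in self._steps:
            data = map(fn, data) if kind == "map" else filter(fn, data)
        return list(data)

# Stand-in input; in a real job this would be read from storage such as HDFS.
block_sizes = [120, 7, 64, 300]
pipeline = LazyDataset(block_sizes).filter(lambda n: n >= 64).map(lambda n: n * 2)
result = pipeline.collect()
print(result)
```

The storage layer is untouched by any of this: the pipeline only describes the computation, which is why a processing engine can replace MapReduce without replacing HDFS or HBase.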
I hope this helps!
Upvotes: 1