What do people mean by "intermediate results" when talking about Hadoop, Spark, and Big Data?

Question

I'm trying to learn a little more about big data, particularly with regard to using Hadoop and Spark. However, I keep seeing the term "intermediate results" and I am not quite sure what it refers to.

For example, I read that "Hadoop writes intermediate results to a computer's storage disk, while Spark keeps those same results in memory whenever possible." I assumed this referred to the results after MapReduce, but I am not sure.

Can someone go into a little more detail about what "intermediate results" are and how they differ between Spark and Hadoop?

OneCricketeer · Accepted Answer

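Between the map phase and the reduce phase, a shuffle and sort operation is performed on the data being processed. The map output that goes through that step is intermediate to the whole operation: it is neither the original input nor the final reduced output, and that is what "intermediate results" refers to.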
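
To make that concrete, here is a minimal sketch of a word count in plain Python. It is not Hadoop or Spark code; the function names (`map_phase`, `shuffle_and_sort`, `reduce_phase`) and the sample input are made up for illustration, and the whole thing runs in one process rather than across a cluster.

```python
from collections import defaultdict

def map_phase(lines):
    """Emit one (word, 1) pair per word. These pairs are the intermediate results."""
    return [(word.lower(), 1) for line in lines for word in line.split()]

def shuffle_and_sort(pairs):
    """Group the intermediate pairs by key and order the keys,
    roughly what happens between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return sorted(grouped.items())

def reduce_phase(grouped):
    """Collapse each key's list of values into a final count."""
    return {key: sum(values) for key, values in grouped}

# Hypothetical sample input
lines = ["spark keeps data in memory", "hadoop writes data to disk"]

intermediate = map_phase(lines)           # map output, not yet the answer
grouped = shuffle_and_sort(intermediate)  # still intermediate
counts = reduce_phase(grouped)            # final result
```

Here `intermediate` and `grouped` are the data that exists only between the two phases. In terms of the sentence you quoted, that is the data Hadoop writes to disk, while Spark tries to keep it in memory when it can.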
