NebulousReveal

Reputation: 572

Why are Random Forests easier to parallelize than Gradient Boosted Machines?

Is it because in GBMs each decision tree depends on the previous decision trees? In other words, because there is no independence between the trees?

Upvotes: 2

Views: 904

Answers (1)

desertnaut

Reputation: 60388

As you have already suspected, it is exactly because in GBM each decision tree depends on the previous ones; the trees cannot be fit independently, so parallelizing the tree-building stage is in principle not possible.

Consider the following excerpt, quoted from The Elements of Statistical Learning, Ch. 10 (Boosting and Additive Trees), pp. 337-339 (emphasis mine):

A weak classifier is one whose error rate is only slightly better than random guessing. The purpose of boosting is to sequentially apply the weak classification algorithm to repeatedly modified versions of the data, thereby producing a sequence of weak classifiers Gm(x), m = 1, 2, . . . , M. The predictions from all of them are then combined through a weighted majority vote to produce the final prediction. [...] Each successive classifier is thereby forced to concentrate on those training observations that are missed by previous ones in the sequence.

In a picture (ibid, p. 338):

[image: schematic of the sequential boosting procedure]

In a Random Forest, on the other hand, all trees are independent of one another, so parallelizing the algorithm is relatively straightforward.
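The contrast can be made concrete with a minimal sketch (my own illustration, not from ESL): each forest tree fits an independent bootstrap sample, so the fits can be dispatched to a thread pool, whereas each boosted tree fits the residuals left by all earlier trees, forcing a sequential loop. The hyperparameters (10 trees, depth 3, learning rate 0.1) are arbitrary choices for the demo.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor
from sklearn.tree import DecisionTreeRegressor
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=200, n_features=5, random_state=0)

# Random Forest style: each tree sees its own bootstrap sample and
# knows nothing about the other trees -> fits can run in parallel.
def fit_one_tree(seed):
    rng = np.random.RandomState(seed)
    idx = rng.randint(0, len(X), len(X))          # bootstrap resample
    return DecisionTreeRegressor(max_depth=3, random_state=seed).fit(X[idx], y[idx])

with ThreadPoolExecutor() as pool:
    forest = list(pool.map(fit_one_tree, range(10)))

# Gradient boosting style: tree m is fit to the residuals of the
# ensemble built from trees 1..m-1 -> the loop is inherently sequential.
learning_rate = 0.1
pred = np.zeros_like(y, dtype=float)
boosted = []
for m in range(10):
    residual = y - pred                            # depends on ALL previous trees
    tree = DecisionTreeRegressor(max_depth=3, random_state=m).fit(X, residual)
    pred += learning_rate * tree.predict(X)        # update before the next tree can start
    boosted.append(tree)
```

Note that production libraries still parallelize boosting *within* a single tree (e.g. evaluating candidate splits across features), but the tree-to-tree loop itself cannot be distributed the way a forest's can.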

Upvotes: 1
