Reputation: 469
I want to start developing a recommendation system for big data, say 2 GB of log data per day. For this purpose, which is preferred: RHadoop or Apache Mahout?
Please answer from different aspects, such as availability of code, speed, etc.
Upvotes: 0
Views: 382
Reputation: 5702
If you know R and your data is not that big, try SparkR, but be aware that most of the massive R package collection does not integrate well with Spark's distributed data.
If you have big data and are OK with an R-like Scala API, then Mahout is better. You can get your math working on sample data, and the same code will automatically scale to production size.
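To give a rough idea of what "getting your math working on sample data" can look like, here is a minimal sketch of item co-occurrence scoring on a tiny log sample. It is plain Python, not Mahout's or SparkR's API, and the log format, `sample_log`, `cooccurrence`, and `recommend` are all hypothetical names made up for illustration.

```python
from collections import defaultdict
from itertools import permutations

# Hypothetical sample of a log: (user, item) interaction pairs.
sample_log = [
    ("u1", "a"), ("u1", "b"),
    ("u2", "a"), ("u2", "b"), ("u2", "c"),
    ("u3", "b"), ("u3", "c"),
]

def cooccurrence(log):
    """For each ordered item pair, count the users who interacted with both."""
    items_by_user = defaultdict(set)
    for user, item in log:
        items_by_user[user].add(item)
    counts = defaultdict(lambda: defaultdict(int))
    for items in items_by_user.values():
        for i, j in permutations(items, 2):
            counts[i][j] += 1
    return counts

def recommend(user, log, counts, top_n=3):
    """Rank unseen items by summed co-occurrence with the user's seen items."""
    seen = {item for u, item in log if u == user}
    scores = defaultdict(int)
    for item in seen:
        for other, c in counts[item].items():
            if other not in seen:
                scores[other] += c
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda k: (-scores[k], k))[:top_n]

counts = cooccurrence(sample_log)
print(recommend("u1", sample_log, counts))
```

Once the scoring behaves sensibly on a sample like this, the same logic is what you would express in the distributed framework you pick.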
Upvotes: 1