S4M

Reputation: 4661

What can you do with Hadoop that is impossible or very hard to do with Hive?

I am rather new to Hadoop and Hive, and would like an example of something that can be done easily with Hadoop but for which Hive is not a good fit.

Upvotes: 0

Views: 405

Answers (2)

myui

Reputation: 306

TF-IDF can be computed in Apache Hive with the Hivemall extension: https://github.com/myui/hivemall/wiki/TFIDF-calculation

Computing TF-IDF requires just two views and one query. Easy!
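To make the "two views and one query" shape concrete, here is a minimal sketch in plain Python rather than HiveQL. It is not the Hivemall code from the linked wiki: the function names, the sample documents, and the choice of `tf * ln(N / df)` as the weighting are all assumptions made for illustration.

```python
import math
from collections import Counter


def term_frequency(docs):
    """First 'view': relative frequency of each word within each document."""
    tf = {}
    for doc_id, text in docs.items():
        words = text.lower().split()
        for word, count in Counter(words).items():
            tf[(doc_id, word)] = count / len(words)
    return tf


def document_frequency(docs):
    """Second 'view': number of documents that contain each word."""
    df = Counter()
    for text in docs.values():
        df.update(set(text.lower().split()))
    return df


def tfidf(docs):
    """The final 'query': join both views on the word and weight tf by idf."""
    tf = term_frequency(docs)
    df = document_frequency(docs)
    n_docs = len(docs)
    return {
        (doc_id, word): freq * math.log(n_docs / df[word])
        for (doc_id, word), freq in tf.items()
    }


# Hypothetical sample data, keyed by document id.
docs = {1: "hive runs sql on hadoop", 2: "hadoop runs mapreduce jobs"}
scores = tfidf(docs)
```

Each function is a group-by or a join, which is why this particular computation maps well onto SQL-style views.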

Upvotes: 1

user248333

Reputation: 26

Anything that is not a "relational workload" (i.e. something you could also do with a SQL database) is not well suited to Hive. There is probably always a way to do it in Hive as well (primarily because UDFs are available), but it would not be easy.

You're differentiating between "Hadoop" and "Hive". However, "Hadoop" is a rather general term: it could mean HDFS (the distributed file system), YARN (the resource manager), or Hadoop's implementation of the MapReduce programming model introduced by Google. I assume you mean MapReduce when comparing Hadoop and Hive.

I would say computing PageRank, which is commonly done with MapReduce, would probably be quite annoying with Hive. Another example would be computing TF-IDF.
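To illustrate why PageRank fits the map/reduce style, here is a minimal single-machine sketch in plain Python. It is not Hadoop code: the function names, the toy graph, the damping factor of 0.85, and the fixed iteration count are assumptions, and it assumes every page has at least one outgoing link.

```python
from collections import defaultdict

DAMPING = 0.85  # assumed damping factor, the value commonly used in examples


def map_phase(graph, ranks):
    """Mapper: each page emits an equal share of its rank to every page it links to."""
    for page, links in graph.items():
        for target in links:
            yield target, ranks[page] / len(links)


def reduce_phase(graph, contributions):
    """Reducer: sum the shares received by each page and apply damping."""
    sums = defaultdict(float)
    for target, share in contributions:
        sums[target] += share
    n_pages = len(graph)
    return {page: (1 - DAMPING) / n_pages + DAMPING * sums[page] for page in graph}


def pagerank(graph, iterations=20):
    """Run one map/reduce pass per iteration, feeding each result into the next."""
    ranks = {page: 1 / len(graph) for page in graph}
    for _ in range(iterations):
        ranks = reduce_phase(graph, map_phase(graph, ranks))
    return ranks


# Hypothetical toy graph: page -> pages it links to (no page without out-links).
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
```

The loop in `pagerank` is the awkward part for Hive: each pass depends on the output of the previous one, so in plain HiveQL you would presumably have to rerun a query per iteration from an external driver script.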

Upvotes: 1
