Reputation: 11744
I don't understand what types of apps can be used with Hadoop. Does each task have to be tailored for Hadoop/MapReduce? For example, can you just associate any long-running Java process with it, or do you specifically have to tailor your app/task for Hadoop? I guess a good example would be using Lucene and Hadoop for indexing.
Upvotes: 0
Views: 1235
Reputation: 1607
Basically, you have to be able to 'split' your job into tasks that are independent of one another.
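For instance (a toy, non-Hadoop sketch in plain Java; the method names are made up for illustration): counting matching lines splits cleanly because each line can be handled on any node in any order, while a computation that threads state from one line to the next cannot be split.

```java
import java.util.List;

public class Splittable {
    // Splits: the result for each line depends only on that line, so the
    // lines can be partitioned across machines and the partial counts summed.
    static long countErrors(List<String> lines) {
        return lines.stream().filter(l -> l.contains("ERROR")).count();
    }

    // Does not split: each step depends on the value produced by the
    // previous step, so the lines cannot be processed independently.
    static long runningChecksum(List<String> lines) {
        long state = 17;
        for (String line : lines) {
            state = state * 31 + line.hashCode();
        }
        return state;
    }
}
```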
Upvotes: 0
Reputation: 10652
MapReduce is a processing model; it prescribes exactly the shape your processing task has to fit into.
Hadoop implements (among other things) MapReduce, with the added advantage that you can actually run a job reliably on 1000 systems in parallel (if you have enough independent pieces).
Given those constraints, some things cannot be done and a lot of things can. Analyzing logfiles (i.e. a large set of independent lines) or web analytics (everything a single visitor/session did can be processed separately) are among the most common applications.
So yes, your task must be transformed to fit the model for it to work.
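To make the logfile case concrete, here is a minimal sketch of a Hadoop job that counts HTTP status codes: the class names and the crude space-splitting parse are illustrative assumptions, not production code.

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class LogStatusCount {

    // Map: one log line in, one (statusCode, 1) pair out. Each call is
    // independent of every other line, which is exactly what lets Hadoop
    // run thousands of mappers in parallel over splits of the input.
    public static class StatusMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text status = new Text();

        @Override
        protected void map(LongWritable offset, Text line, Context context)
                throws IOException, InterruptedException {
            String[] fields = line.toString().split(" ");
            if (fields.length > 8) {       // crude parse of a common-format access log
                status.set(fields[8]);     // e.g. "200", "404"
                context.write(status, ONE);
            }
        }
    }

    // Reduce: all counts for one status code arrive together; sum them.
    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> counts, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable c : counts) {
                sum += c.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }
}
```

Note that every key and value crossing a process boundary here is a Writable (Text, IntWritable); that is the price of being able to ship the work to other machines.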
Upvotes: 1
Reputation: 14234
Hadoop is really a split/combine engine for processing. You split a task up into similar sets of data [map] and then combine those sets into a result [reduce/merge].
It's one way of making a parallel application. The maps and reduces are distributed to different nodes within the cluster, so there is a very strict division of tasks and of what data can be passed between the processes [it must be serializable and disconnected from the data in the other maps/reduces].
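To illustrate that serializable/disconnected constraint, here is a sketch of a custom value type (SessionStats and its fields are hypothetical names): anything passed between maps and reduces must implement Hadoop's Writable interface so it can be serialized across node boundaries, and it carries only plain fields, never references into another map's data.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

public class SessionStats implements Writable {
    private long hits;
    private long bytesServed;

    public SessionStats() { }   // Hadoop requires a no-arg constructor for deserialization

    public SessionStats(long hits, long bytesServed) {
        this.hits = hits;
        this.bytesServed = bytesServed;
    }

    // Hadoop calls these to move the value between processes; only the raw
    // field values travel, so each map/reduce sees a disconnected copy.
    @Override
    public void write(DataOutput out) throws IOException {
        out.writeLong(hits);
        out.writeLong(bytesServed);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        hits = in.readLong();
        bytesServed = in.readLong();
    }
}
```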
Upvotes: 0