Ach Raf

Reputation: 31

How does Hadoop deal with files that have no key-value structure?

I am new to Hadoop and am learning the MapReduce paradigm. The tutorial I am following says that MapReduce applies two operations (map and reduce) based on the key-value pairs in the file. I know that Hadoop also deals with unstructured data, so I was wondering how it handles MapReduce when the data is unstructured.

Upvotes: 0

Views: 91

Answers (1)

OneCricketeer

Reputation: 191864

Take the example of the text

Hello
World

There are two lines of text, but there is naturally a key and a value: the byte offset within the file and the line itself. If you hex dump the file, you'd see something like this:

0x0 Hello
0x6 World

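This is how the default TextInputFormat presents plaintext files to MapReduce (and other runtime engines): each record's key is the byte offset and its value is the line. HDFS itself just stores the file as fixed-size blocks; it is the input format that finds the line boundaries within them.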
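To make the offset/line pairing concrete, here is a minimal sketch in plain Python. It is not Hadoop code and does not use any Hadoop API; the function name `offset_line_records` is made up for illustration. It only mimics the idea of deriving a key (byte offset) and a value (line) from a file with no explicit key-value structure.

```python
from typing import Iterator, Tuple


def offset_line_records(data: bytes) -> Iterator[Tuple[int, str]]:
    """Yield (byte_offset, line) pairs, one per line of the input.

    Hypothetical helper for illustration only; it imitates the shape of
    the records a line-oriented input format hands to a mapper.
    """
    offset = 0
    # keepends=True preserves the newline so byte offsets stay accurate.
    for raw in data.splitlines(keepends=True):
        yield offset, raw.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw)


records = list(offset_line_records(b"Hello\nWorld\n"))
print(records)  # [(0, 'Hello'), (6, 'World')]
```

A mapper would then receive each pair in turn, typically ignoring the offset and working on the line.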

If you're storing video, images, audio, PDF documents, etc., then you must implement your own InputFormat (and its RecordReader) to determine how the bytes of the file should be structured and parallelized, if at all.
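One common choice for binary files is to not split them at all and treat each file as a single record. In Hadoop that usually means subclassing FileInputFormat, overriding isSplitable to return false, and supplying a RecordReader that emits the whole file. The sketch below only imitates that idea in plain Python; `whole_file_records` and `map_size` are hypothetical names, not part of any Hadoop API.

```python
from pathlib import Path
from typing import Iterable, Iterator, Tuple


def whole_file_records(paths: Iterable[str]) -> Iterator[Tuple[str, bytes]]:
    """Yield one (file_name, file_bytes) record per file, never splitting it.

    Hypothetical stand-in for a "whole file" input format: the key is the
    file name and the value is the raw content.
    """
    for p in paths:
        path = Path(p)
        yield path.name, path.read_bytes()


def map_size(key: str, value: bytes) -> Tuple[str, int]:
    """Example map step: emit the file name with its size in bytes."""
    return key, len(value)
```

With this layout the work is parallelized per file rather than per chunk of a file, which is often the only sensible option for formats that cannot be decoded from an arbitrary byte position.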

Upvotes: 0
