MapReduce program to implement a data structure in the Hadoop framework

Question

I am implementing a data structure in Hadoop: I want to build an index using MapReduce.

Part 1: store each word of a text file in a table together with its index number. [Completed]
Part 2: apply hashing to this newly created table. [Not completed]

I was able to complete Part 1, but I am having difficulty with Part 2.

Suppose I have a text file containing 3 lines:

  how is your job
  how is your family
  hi how are you

I want to store this text file using indexing. I have MapReduce code that returns the index value of every word, and I am able to store these index values in an index table (hash table). The output, which contains the index value of every word, looks like this: how 0, how 14, is 3, is 18, job 12, your 7,

Now, to store the words in the hash table, apply a hash function to each word's index value: take the index value modulo ('%') the number of distinct elements in the file, let's say 4. If there is a collision at a location, go to the next location and store the word there.

  0 % 4 = 0 (store 'how' at hash index 0)
  14 % 4 = 2 (store 'how' at hash index 2)
  18 % 4 = 2 (store 'is' at hash index 3 because of collision)
  7 % 4 = 3 (store 'your' at hash index 4 because of collision)

chandu kavar · Accepted Answer

You can create a Hashtable object and put each key and value into it.

Hashtable<Integer, String> hashtable = new Hashtable<>();

How do you find the key? You have the total count of distinct words and each word's index, so:

  key = index % number of distinct words
  value = word

Before inserting a record into the hashtable, check whether a collision occurs for that key. How do you check for a collision?

boolean collision = hashtable.containsKey(key);

If collision is true, linearly check key+1, key+2, ... until collision is false, then insert the key and value into the hashtable using the line below.

hashtable.put(key, value);
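
Putting the steps above together, here is a minimal sketch of the insert with linear probing, run outside Hadoop on the four index values from the question. The class name LinearProbeDemo and the insert helper are made up for illustration, and the sketch assumes that probing simply moves to key+1 without wrapping around, as in the question's example (so a slot number can exceed the number of distinct words).

```java
import java.util.Hashtable;

public class LinearProbeDemo {

    // Hypothetical helper: stores word at (index % distinctWords), moving to
    // key+1, key+2, ... while the slot is already taken.
    // Returns the slot that was finally used.
    public static int insert(Hashtable<Integer, String> table,
                             int index, int distinctWords, String word) {
        int key = index % distinctWords;
        while (table.containsKey(key)) {
            key++; // assumption: no wrap-around, matching the question's example
        }
        table.put(key, word);
        return key;
    }

    public static void main(String[] args) {
        Hashtable<Integer, String> table = new Hashtable<>();
        int distinctWords = 4; // value assumed in the question

        // Index values and words taken from the question's worked example.
        int[] indexes = {0, 14, 18, 7};
        String[] words = {"how", "how", "is", "your"};

        for (int i = 0; i < indexes.length; i++) {
            int slot = insert(table, indexes[i], distinctWords, words[i]);
            System.out.println(words[i] + " (index " + indexes[i] + ") -> slot " + slot);
        }
    }
}
```

With these inputs the words land in slots 0, 2, 3 and 4, the same placement as in the question. If the table must stay within a fixed number of slots, one option is to wrap around with key = (key + 1) % tableSize, but that only works while the table has fewer entries than slots.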
