Hadoop MapReduce: Two Keys in one line, but how?

Question

I have CSV files whose fields are separated by semicolons. Each line is one record and describes one edge of a graph, so a line looks like the following:

Node_X;Node_Y;5

This is interpreted as an edge (link) between nodes X and Y with a weight of 5. My mappers receive this input. What I want to achieve is to aggregate the weights per node.

The following example illustrates my scenario:

Node_X;Node_Y;5

Node_X;Node_Z;10

Node_X;Node_A;60

Node_Y;Node_A;20

Then the result by nodes should be:

Node_X;75; Node_Y;25; Node_Z;10; Node_A;80

I want to collect all distinct nodes and assign each one the sum of the weights of all edges it is part of.

In my mapper, I can read the edge information:

Node_X;Node_A;60

But how can I make two keys out of this line for my reducers? It should be something like

context.write(Node_X,60);

context.write(Node_A,60);

How can I achieve this?

Thx!

P.S.: The edges are undirected.

Hari Menon · Accepted Answer

It should be something like

context.write(Node_X,60);

context.write(Node_A,60);

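Assuming you haven't tried it before asking: that will work. A mapper may call context.write() any number of times for a single input record, so you can emit one key/value pair for each endpoint of the edge, with the node name and the weight wrapped in Writable types such as Text and IntWritable.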
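To make the idea concrete, here is a minimal plain-Java sketch of the two steps. It deliberately does not depend on Hadoop: the `emit` callback is a stand-in for `context.write(...)`, and the class name `EdgeWeightSum` and the methods `map`, `reduce` and `run` are hypothetical names chosen for illustration. The `run` method only imitates the grouping by key that the framework does between map and reduce. Skipping malformed lines is an assumption, since the question does not say how they should be handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.BiConsumer;

// Plain-Java stand-in for the map and reduce steps (no Hadoop dependency).
class EdgeWeightSum {

    // Map step: one input line such as "Node_X;Node_A;60" yields TWO pairs.
    // In a real Mapper, each emit.accept(...) would be a context.write(...) call.
    static void map(String line, BiConsumer<String, Integer> emit) {
        String[] fields = line.trim().split(";");
        if (fields.length != 3) {
            return; // assumption: silently skip malformed lines
        }
        int weight = Integer.parseInt(fields[2].trim());
        emit.accept(fields[0].trim(), weight); // first endpoint of the edge
        emit.accept(fields[1].trim(), weight); // second endpoint (edges are undirected)
    }

    // Reduce step: sum all weights that arrived for one node.
    static int reduce(Iterable<Integer> weights) {
        int sum = 0;
        for (int w : weights) {
            sum += w;
        }
        return sum;
    }

    // Imitates the shuffle: group emitted values by key, then reduce each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            map(line, (node, w) ->
                    grouped.computeIfAbsent(node, k -> new ArrayList<>()).add(w));
        }
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            totals.put(e.getKey(), reduce(e.getValue()));
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> edges = Arrays.asList(
                "Node_X;Node_Y;5",
                "Node_X;Node_Z;10",
                "Node_X;Node_A;60",
                "Node_Y;Node_A;20");
        // prints {Node_A=80, Node_X=75, Node_Y=25, Node_Z=10}
        System.out.println(run(edges));
    }
}
```

In an actual job the same two `emit` calls would sit inside your Mapper's `map` method, and the summing loop inside your Reducer's `reduce` method. The framework takes care of bringing all values for the same node to the same reducer.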
