Reputation: 6139
In my HDFS I will be doing XML processing, i.e. processing an XML file and extracting 2 nodes. These will be my x and y to plot a graph.
How can I generate a graph from the HDFS output? I want to use RapidMiner. Any idea how I can do this?
Or else:
Is there a way to visualize my Hadoop data?
Upvotes: 0
Views: 192
Reputation: 5184
HDFS works by splitting a file into blocks of a predefined size. It is just like running
split -b 64M file.xml
HDFS then takes each block and saves it to a slave datanode. If your HDFS has a block size of 64 MB and the file size is 1 GB, the file will be split into 16 blocks and saved in different locations. A MapReduce job will not be able to make sense of a single XML file block, because XML is structured, unlike simple CSV or TSV files where each line stands on its own. So as far as I can see, you cannot process an XML file over HDFS if it is greater than the HDFS block size.
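To make the block-splitting point concrete, here is a minimal Python sketch. It only models a byte-level cut like `split -b`; it does not use Hadoop and says nothing about how a particular Hadoop input format reads records. The tiny `<points>` document, the 32-byte block size, and the helper names `split_bytes` and `is_well_formed` are all made up for illustration.

```python
import math
import xml.etree.ElementTree as ET
from typing import List


def split_bytes(data: bytes, block_size: int) -> List[bytes]:
    """Cut data into fixed-size blocks, ignoring its content (like `split -b`)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def is_well_formed(chunk: bytes) -> bool:
    """Return True if the chunk parses as a complete XML document on its own."""
    try:
        ET.fromstring(chunk)
        return True
    except ET.ParseError:
        return False


# Block count from the answer's example: a 1 GB file with 64 MB blocks.
file_size = 1024 * 1024 * 1024
block_size = 64 * 1024 * 1024
num_blocks = math.ceil(file_size / block_size)

# Hypothetical small XML document standing in for the large file.
doc = (
    b"<points>"
    + b"".join(b"<p><x>%d</x><y>%d</y></p>" % (i, i * i) for i in range(10))
    + b"</points>"
)

# Cut it at arbitrary byte offsets (32 bytes here instead of 64 MB).
blocks = split_bytes(doc, 32)
well_formed = [is_well_formed(b) for b in blocks]

print("blocks for 1 GB at 64 MB:", num_blocks)
print("whole document parses:", is_well_formed(doc))
print("any single block parses:", any(well_formed))
```

The whole document parses, but the individual blocks do not, because the cuts land in the middle of elements and no block has a single enclosing root of its own.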
Upvotes: 1