Kenny Bi

Reputation: 55

How can Hadoop MapReduce read input from a CSV file?

I want to implement a Hadoop MapReduce job that uses a CSV file as its input. Does Hadoop provide a method for reading the values from a CSV file, or do we just use Java's String split function?

Thanks, all.

Upvotes: 4

Views: 12760

Answers (1)

Ashish

Reputation: 5791

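By default, Hadoop uses TextInputFormat, which feeds the mapper one line at a time from the input file. The key passed to the mapper is the byte offset of the line within the file (not a line count), and the value is the text of the line. Be careful with CSV files, though: a single quoted column/field can contain a line break, which a line-based reader will split into two records. You might want to look for a CSV input reader like this one: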

https://github.com/mvallebr/CSVInputFormat/blob/master/src/main/java/org/apache/hadoop/mapreduce/lib/input/CSVNLineInputFormat.java

But you have to split each line into fields yourself, in your own code.
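
For the splitting itself, a plain `line.split(",")` breaks as soon as a quoted field contains a comma. Here is a minimal sketch of a quote-aware splitter in plain Java. The class and method names (`CsvLineSplitter`, `splitCsvLine`) are made up for illustration and are not part of Hadoop. It assumes comma separators and double-quote quoting with `""` as the escaped quote, and it does not handle fields that span several lines.

```java
import java.util.ArrayList;
import java.util.List;

public class CsvLineSplitter {

    // Hypothetical helper (not a Hadoop API): splits one CSV line into fields.
    // Commas inside double-quoted fields are kept, and a doubled quote ("")
    // inside a quoted field becomes a single literal quote.
    public static List<String> splitCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // escaped quote
                        i++;                 // skip the second quote
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);        // start the next field
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field on the line
        return fields;
    }

    public static void main(String[] args) {
        // Print each parsed field on its own line.
        for (String field : splitCsvLine("1,\"Smith, John\",42")) {
            System.out.println(field);
        }
    }
}
```

Inside a mapper you would call something like this on `value.toString()`, where `value` is the `Text` line handed to `map()`. If your data is simple (no quotes, no embedded commas), `line.split(",", -1)` is probably enough; the `-1` keeps trailing empty fields.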

Upvotes: 5
