Reputation: 53
I am trying to get Mahout working and I am getting the following error:
13/05/16 22:48:53 INFO mapred.MapTask: record buffer = 262144/327680
13/05/16 22:48:53 WARN mapred.LocalJobRunner: job_local_0001
java.lang.NumberFormatException: For input string: "1119"
at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
at java.lang.Long.parseLong(Long.java:430)
at java.lang.Long.parseLong(Long.java:483)
at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:47)
at org.apache.mahout.cf.taste.hadoop.item.ItemIDIndexMapper.map(ItemIDIndexMapper.java:31)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
13/05/16 22:48:54 INFO mapred.JobClient: map 0% reduce 0%
13/05/16 22:48:54 INFO mapred.JobClient: Job complete: job_local_0001
13/05/16 22:48:54 INFO mapred.JobClient: Counters: 0
Exception in thread "main" java.io.FileNotFoundException: File does not exist: /user/eric.waite/temp/preparePreferenceMatrix/numUsers.bin
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1843)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1834)
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:578)
My input file is very simple. Each line is userid,storyId,rating (rating is 1-5). A sample:
2840281,1119,2
2840321,1170,3
2840323,1124,5
2840371,1170,5
2840347,1157,3
2840371,1172,5
2840347,1157,5
2840358,1333,5
2840371,1172,5
2840347,1157,5
I am trying to run a basic example using the following command:
hadoop jar /sourcecode/mahout/mahout-distribution-0.7/mahout-core-0.7-job.jar org.apache.mahout.cf.taste.hadoop.item.RecommenderJob -s SIMILARITY_COOCCURRENCE --input ratings.dat --output output
Java information:
java version "1.7.0_13"
Java(TM) SE Runtime Environment (build 1.7.0_13-b20)
Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)
I am on a Mac running OS X 10.8.2.
Does anyone have any suggestions on why the integer is being read as a string and generating the NumberFormatException?
Thank you.
Upvotes: 4
Views: 2926
Reputation: 558
You can debug RecommenderJob to see where the exception occurs and inspect the actual string value; there may be a blank or other stray character in the input file. I get this exception too, and in my case it occurs here:
String[] tokens = TasteHadoopUtils.splitPrefTokens(value.toString());
long itemID = Long.parseLong(tokens[transpose ? 0 : 1]);
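If attaching a debugger is awkward, one way to see the actual string value is to print the token with every non-printable character spelled out. This is only a sketch: TokenInspector and escape are made-up names, not part of Mahout, and the NUL-prefixed token is just a stand-in for whatever is really in your file.

```java
class TokenInspector {
    /** Returns s with every character outside printable ASCII shown as <U+XXXX>. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x20 && c < 0x7f) {
                sb.append(c);                                     // printable ASCII: keep as-is
            } else {
                sb.append(String.format("<U+%04X>", (int) c));    // make hidden characters visible
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical bad token: a NUL character followed by the digits 1119.
        String token = (char) 0 + "1119";
        try {
            Long.parseLong(token);
        } catch (NumberFormatException e) {
            System.out.println("bad token: [" + escape(token) + "] length=" + token.length());
        }
    }
}
```

A token that looks like 1119 but reports a length other than 4 is a strong hint that something invisible is attached to it.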
Upvotes: 0
Reputation: 66886
You likely have some non-printing character funny business in here. The string it shows, of course, parses just fine as a long. (The quotes are only part of its error message.)
To see what I mean, try
System.out.println(Long.parseLong("\u00001119"));
It fails with the same error, one that is on its face puzzling.
Not sure how to debug this easily short of a hex editor.
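As a rough stand-in for a hex editor, something like the following could scan a local copy of the input and report every byte that is not a digit, a comma, or a line feed. It is a sketch under assumptions: InputScanner and findSuspectBytes are hypothetical names, and it assumes the file should contain nothing but comma-separated integer fields with Unix line endings, so a carriage return is reported too.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class InputScanner {
    /** Describes every byte that is not an ASCII digit, a comma, or a line feed. */
    static List<String> findSuspectBytes(byte[] data) {
        List<String> findings = new ArrayList<String>();
        int line = 1;
        for (int i = 0; i < data.length; i++) {
            int b = data[i] & 0xff;          // treat the byte as unsigned
            if (b == '\n') {
                line++;                      // line feeds only advance the line counter
                continue;
            }
            boolean expected = (b >= '0' && b <= '9') || b == ',';
            if (!expected) {
                findings.add(String.format("line %d, offset %d: 0x%02X", line, i, b));
            }
        }
        return findings;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a local copy of the input file.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        for (String finding : findSuspectBytes(data)) {
            System.out.println(finding);
        }
    }
}
```

Run it with the path to the ratings file as the only argument; no output means every byte was one of the expected ones.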
Upvotes: 1