Reputation: 33223
I have a JSON file in the following format:
{ "_id" : "foo.com", "categories" : [], "h1" : { "bar==" : { "first" : 1281916800, "last" : 1316995200 }, "foo==" : { "first" : 1281916800, "last" : 1316995200 } }, "name2" : [ "foobarl.com", "foobar2.com" ], "rep" : null }
So how do I parse this JSON in Pig?
Also, categories and rep will not always be empty; they can contain character data. I made the following attempt:
a = load 'sample_json.json' using JsonLoader('id:chararray,categories:[chararray], hostt:{ (variable_a: {(first:int,last:int)})}, ns:[chararray],rep:chararray ');
But I get this error:
org.codehaus.jackson.JsonParseException: Unexpected character ('D' (code 68)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')
 at [Source: java.io.ByteArrayInputStream@4795b8e9; line: 1, column: 50]
    at org.codehaus.jackson.JsonParser._constructError(JsonParser.java:1291)
    at org.codehaus.jackson.impl.JsonParserMinimalBase._reportError(JsonParserMinimalBase.java:385)
    at org.codehaus.jackson.impl.JsonParserMinimalBase._reportUnexpectedChar(JsonParserMinimalBase.java:306)
    at org.codehaus.jackson.impl.Utf8StreamParser._handleUnexpectedValue(Utf8StreamParser.java:1582)
    at org.codehaus.jackson.impl.Utf8StreamParser.nextToken(Utf8StreamParser.java:386)
    at org.apache.pig.builtin.JsonLoader.readField(JsonLoader.java:173)
    at org.apache.pig.builtin.JsonLoader.getNext(JsonLoader.java:157)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:211)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:532)
    at org.apache.hadoop.mapreduce.MapContext.nextKeyValue(MapContext.java:67)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:143)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
Upvotes: 3
Views: 782
Reputation: 359
You can use the Elephant Bird Pig jar to parse JSON. It handles all sorts of JSON data, including nested structures. The project has several example scripts for parsing JSON with this jar: https://github.com/twitter/elephant-bird/tree/master/examples/src/main/pig
It doesn't break even if an expected JSON key isn't present.
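As a rough sketch of what that could look like for the sample record in the question: the script below loads each line as a Pig map with Elephant Bird's JsonLoader and then pulls out the top-level keys. The jar file names, the relation names (raw, parsed), and the output field names are assumptions for illustration; use the jar versions that match your Pig and Hadoop installation. I have not run this against your data.

```pig
-- Jar names are placeholders (assumption): point these at the versions you actually have.
REGISTER elephant-bird-core.jar;
REGISTER elephant-bird-pig.jar;
REGISTER elephant-bird-hadoop-compat.jar;
REGISTER json-simple.jar;

-- Each input line must hold one complete JSON object.
-- '-nestedLoad' asks the loader to convert nested objects and arrays too,
-- instead of leaving them as plain strings.
raw = LOAD 'sample_json.json'
      USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad')
      AS (json:map[]);

-- Look up keys with the # operator; a key that is absent or null yields null
-- rather than an error.
parsed = FOREACH raw GENERATE
             (chararray) json#'_id'  AS id,
             json#'categories'       AS categories,
             json#'h1'               AS h1,
             json#'name2'            AS name2,
             (chararray) json#'rep'  AS rep;

DUMP parsed;
```

Because the loader returns an untyped map, you do not have to declare a schema for keys such as "bar==" and "foo==" under h1, whose names vary from record to record.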
Upvotes: 3