Reputation: 597
I have data with the following schema:
id : long,
list: {(itemId: Long, itemName: charArray)}
In my data, list can be either a bag of tuples or null. I would like to turn each null into an empty bag (one with 0 elements).
I tried something like:
answer = FOREACH data
GENERATE (list is null ? {} : list) AS list;
Pig says that {} and list do not have compatible schemas. How can I create an empty bag with a compatible schema?
I ended up doing this and it worked:
answer = FOREACH data
GENERATE (list is null ? (bag{tuple(long,chararray)}){} : list) AS list:{(itemId: long, itemName: charArray)};
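For context, here is a minimal end-to-end sketch built around the cast above. The input file name 'data.tsv' and the LOAD schema are assumptions for illustration, not part of my real job, and I have only confirmed the cast expression itself:

```pig
-- Hypothetical input file; the schema mirrors the one described in the question.
data = LOAD 'data.tsv'
       AS (id: long, list: bag{t: tuple(itemId: long, itemName: chararray)});

-- Casting the empty bag constant gives it the same tuple type as list,
-- so both branches of the conditional have compatible schemas.
answer = FOREACH data GENERATE
    id,
    (list is null ? (bag{tuple(long,chararray)}){} : list)
        AS list:{(itemId: long, itemName: chararray)};

DUMP answer;
```

Rows whose list was null should then carry an empty bag instead.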
Upvotes: 3
Views: 6996
Reputation: 1279
If you are able to use it, take a look at the Apache DataFu project: http://datafu.incubator.apache.org
It has lots of useful stuff for Pig, including datafu.pig.bags.NullToEmptyBag(), which does exactly what you are looking for:
DEFINE NullToEmptyBag datafu.pig.bags.NullToEmptyBag();
...
answer = FOREACH data GENERATE NullToEmptyBag(list) AS list...;
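A slightly fuller sketch of how this might fit into a script. The jar path, the input file name, and the LOAD schema are all assumptions here, so adjust them to your own setup:

```pig
-- Assumed jar location; point this at wherever the DataFu jar actually lives.
REGISTER /path/to/datafu-pig.jar;

DEFINE NullToEmptyBag datafu.pig.bags.NullToEmptyBag();

-- Hypothetical input with the schema from the question.
data = LOAD 'data.tsv'
       AS (id: long, list: bag{t: tuple(itemId: long, itemName: chararray)});

-- The UDF returns an empty bag when list is null and the original bag otherwise.
answer = FOREACH data GENERATE id, NullToEmptyBag(list) AS list;

DUMP answer;
```

This avoids writing the cast by hand, at the cost of adding a dependency on the DataFu jar.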
Upvotes: 3
Reputation: 711
empty tuple should be ()
empty bag should be {}
empty map should be []
Upvotes: 0
Reputation: 39893
{} has no type on its own. A bag always has a tuple type inside it, so list and your empty bag need to have the same type.
Unfortunately I don't have Pig set up in a way that lets me test this, and I couldn't find good documentation on how to set the type of a bag, so I'm not sure exactly how to do it. It should be something along these lines, though.
Try this perhaps?
answer = FOREACH data
GENERATE (list is null ? (bag{tuple(long,chararray)}){} : list) AS list;
Upvotes: 5