Mzzzzzz

Reputation: 4970

Filter by length of array in Pig

I have data stored in Avro format. One of the fields of each record (array_field, say) is an array. Using Pig, how do I keep only the records whose array satisfies, for example, length(array_field) >= 2, and then store the results in Avro files using the same schema as the original input?

Upvotes: 0

Views: 1044

Answers (1)

Mikko Kupsu

Reputation: 371

This should be doable with something like the code below:

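-- Load the Avro records; an Avro array is exposed to Pig as a bag.
A = LOAD '$INPUT' USING AvroStorage();
-- SIZE on a bag returns its element count, so this keeps arrays of length >= 2.
B = FILTER A BY SIZE(array_field) >= 2;
-- Replace <schema_here> with the Avro schema (JSON) of the input records.
STORE B INTO '$OUTPUT' USING AvroStorage('schema', '<schema_here>');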
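
If the piggybank version of AvroStorage is the one in use, it may be possible to avoid pasting the schema by hand. As far as I recall, its 'same' option reuses the schema of an existing Avro file or directory. A minimal sketch, assuming piggybank.jar and its Avro dependencies are already registered, that $INPUT and $OUTPUT are passed in as parameters, and that the 'same' option behaves this way in your Pig version (worth checking against the AvroStorage documentation):

```
-- Assumption: piggybank's AvroStorage, with the required jars already registered.
A = LOAD '$INPUT' USING org.apache.pig.piggybank.storage.avro.AvroStorage();
B = FILTER A BY SIZE(array_field) >= 2;
-- Assumption: 'same' tells the storer to reuse the schema found under the given path.
STORE B INTO '$OUTPUT' USING org.apache.pig.piggybank.storage.avro.AvroStorage('same', '$INPUT');
```

Since FILTER does not change the shape of the records, the input schema should still describe the output.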

Upvotes: 1
