Reputation: 4970
I have data stored in Avro format. One of the fields of each record (array_field, say) is an array. Using Pig, how do I obtain only the records whose arrays satisfy, for example, length(array_field) >= 2,
and then store the results in Avro files using the same schema as the original input?
Upvotes: 0
Views: 1044
Reputation: 371
This should be doable with something like the code below:
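-- Avro arrays are loaded as Pig bags, so SIZE returns the number of elements.
A = LOAD '$INPUT' USING AvroStorage();
B = FILTER A BY SIZE(array_field) >= 2;
-- Replace <schema_here> with the JSON schema of the input records.
STORE B INTO '$OUTPUT' USING AvroStorage('schema', '<schema_here>');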
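If pasting the schema JSON by hand is inconvenient, a variant along the following lines may work. It is only a sketch, and it assumes the Piggybank version of AvroStorage (not the one built into newer Pig releases), whose 'same' option is meant to reuse the schema found at an existing Avro file or directory. The jar path is a placeholder, and depending on the Pig version, the Avro and JSON dependency jars may need to be registered as well.

```pig
-- Placeholder path: point this at the piggybank.jar of your installation.
REGISTER /path/to/piggybank.jar;

-- Load with the Piggybank AvroStorage; array fields arrive as bags.
A = LOAD '$INPUT' USING org.apache.pig.piggybank.storage.avro.AvroStorage();

-- Keep records whose array has at least two elements.
-- SIZE of a null field is null, so such records are dropped too.
B = FILTER A BY SIZE(array_field) >= 2;

-- 'same' asks AvroStorage to take the output schema from the Avro data at the given path.
STORE B INTO '$OUTPUT' USING org.apache.pig.piggybank.storage.avro.AvroStorage('same', '$INPUT');
```

The script would be run with both parameters supplied, for example `pig -param INPUT=/data/in -param OUTPUT=/data/out filter.pig` (the paths and script name here are made up for illustration).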
Upvotes: 1