Reputation: 1897
I'm trying to filter out NULL values and empty strings from my data:
data_filtered = FILTER raw_data by COLUMN_NAME is not null and COLUMN_NAME != '' ;
When I run this, I get the following error:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <file jhoughton/temp/temp_script.pig, line 43, column 46> Unexpected character ' '
How can I resolve this error and filter out both NULLs and empty strings?
Upvotes: 3
Views: 5093
Reputation: 31
You can use the TRIM function so that strings containing only whitespace are filtered out as well:
data_filtered = FILTER raw_data by ( COLUMN_NAME is not null and TRIM(COLUMN_NAME) != '' );
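For context, here is a minimal self-contained sketch of that filter in a complete script. The file name 'input.tsv', the tab delimiter, and the single-column schema are assumptions made for illustration; they are not from the question. Note that Pig string literals take straight single quotes, so it may be worth checking that the quotes in the failing script were not replaced by typographic ones or by double quotes.

```pig
-- Hypothetical input: a tab-delimited file 'input.tsv' with one chararray column.
raw_data = LOAD 'input.tsv' USING PigStorage('\t') AS (COLUMN_NAME:chararray);

-- Keep only rows whose value is non-null and still non-empty after trimming.
data_filtered = FILTER raw_data BY (COLUMN_NAME IS NOT NULL) AND (TRIM(COLUMN_NAME) != '');

DUMP data_filtered;
```

Putting each condition in its own parentheses is optional, but it makes the precedence of AND explicit.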
Upvotes: 2
Reputation: 119
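Pig does accept == and != for chararray values, so the comparison operator itself should not be the cause of the error. If you would rather avoid comparing against an empty literal, a regular-expression test with MATCHES is an alternative: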
data_filtered = FILTER raw_data BY (COLUMN_NAME is not null) AND NOT (COLUMN_NAME MATCHES '');
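As a possible extension of the MATCHES approach, the pattern can be widened so that values made up only of spaces are dropped too. This is a sketch, assuming raw_data and COLUMN_NAME are defined as in the question; MATCHES has to match the whole string, so the pattern ' *' (zero or more spaces) covers both the empty string and space-only values, but not tabs or other whitespace.

```pig
-- ' *' matches the empty string and any string consisting solely of spaces.
data_filtered = FILTER raw_data BY (COLUMN_NAME IS NOT NULL) AND NOT (COLUMN_NAME MATCHES ' *');
```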
Upvotes: 1