Super_John

Reputation: 1897

pig: filtering out empty string

I'm trying to filter out NULLs and empty strings from my data:

data_filtered = FILTER raw_data by COLUMN_NAME is not null and COLUMN_NAME != '' ;

When I run this, I get the following error:

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: <file jhoughton/temp/temp_script.pig, line 43, column 46>  Unexpected character ' '

How can I resolve this error and filter out both NULLs and blank strings?

Upvotes: 3

Views: 5093

Answers (2)

user4905630

Reputation: 31

You can use the TRIM function so that strings containing only whitespace are filtered out as well:

data_filtered = FILTER raw_data by ( COLUMN_NAME is not null and TRIM(COLUMN_NAME) != '' );
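
For context, here is a minimal sketch of the same filter in a complete script. The input path `input.tsv`, the tab delimiter, and the single-column schema are hypothetical and not taken from the question; adjust them to your data.

```pig
-- Hypothetical input: a tab-separated file with one chararray column.
raw_data = LOAD 'input.tsv' USING PigStorage('\t') AS (COLUMN_NAME:chararray);

-- Keep only rows whose value is non-null and non-blank after trimming.
-- Note that the string literal uses single quotes.
data_filtered = FILTER raw_data BY (COLUMN_NAME IS NOT NULL) AND (TRIM(COLUMN_NAME) != '');

DUMP data_filtered;
```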

Upvotes: 2

Bryan

Reputation: 119

Instead of comparing with != or ==, you can test for the empty string with MATCHES and negate the result.

The syntax is:

data_filtered = FILTER raw_data BY (COLUMN_NAME is not null) AND NOT (COLUMN_NAME MATCHES '');

Upvotes: 1
