S S

Reputation: 235

Filtering out nulls and blanks in PySpark

I want to keep only the rows where the value is neither null nor blank.

When I try to filter, I get a data type mismatch error. How exactly does this work?

cust.filter((~cust.CODES.isna())|(~cust.CODES==""))

CODES looks something like this:
----------
ABC43719

EFG43719 EFG437

I'm getting this error. I tried isNull() as well.

AnalysisException: cannot resolve '((`CODES` IS NULL) OR `CODES`)' due to data type mismatch: differing types in '((`CODES` IS NULL) OR `CODES`)' (boolean and string).;;
'Filter NOT ((isnull(CODES#2456) OR CODES#2456) = )

and

'Column' object is not callable

Upvotes: 0

Views: 543

Answers (1)

mck

Reputation: 42332

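Use isNull. isna is pandas syntax and does not exist on a PySpark Column, so cust.CODES.isna is treated as a field lookup that returns another Column, and calling it raises 'Column' object is not callable. In the second condition, use != instead of ~ with ==: ~ binds more tightly than ==, so ~cust.CODES=="" is parsed as (~cust.CODES)=="". To keep only rows that are neither null nor blank, combine the two conditions with & rather than |; with |, blank strings still pass because they are not null.

cust.filter((~cust.CODES.isNull())&(cust.CODES!=""))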

Upvotes: 1
