vero

Reputation: 1015

How to test two conditions when filtering an RDD in PySpark?

I have two parameters:

NB_line = 10
NB2_line = 11

I have a Python function that flags the records in my data whose length is not OK. A valid record can have one of two lengths: NB_line = 10 or NB2_line = 11.

At first, my filter looked like this:

rddLignesErreur=rddstats.filter(lambda x : len(x) != NB_line)

After the use case evolved, I changed it to this:

rddLignesErreur=rddstats.filter(lambda x : len(x) != NB_line or len(x) != NB2_line)

Is this correct or not? I'm a beginner in Python.

Thank you

Upvotes: 0

Views: 29

Answers (1)

user10186109

Reputation: 26

Why not just use not in?

lambda x: len(x) not in {NB_line, NB2_line}
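
To see why this matters, here is a minimal sketch in plain Python (no Spark needed). The sample `rows` and the helper names `is_error_or`, `is_error_and`, and `is_error_not_in` are made up for illustration. The `or` version from the question is true for every record, because no length can equal both 10 and 11 at once, so it flags everything as an error. The `not in` version, or the equivalent `and` version, flags only the lengths that are neither 10 nor 11.

```python
NB_line = 10
NB2_line = 11

# Hypothetical sample records with lengths 9, 10, 11 and 12,
# standing in for the rows of rddstats.
rows = [list(range(n)) for n in (9, 10, 11, 12)]


def is_error_or(x):
    # Condition from the question: always True, since a length
    # cannot be equal to both NB_line and NB2_line.
    return len(x) != NB_line or len(x) != NB2_line


def is_error_and(x):
    # Corrected boolean form: the length differs from both values.
    return len(x) != NB_line and len(x) != NB2_line


def is_error_not_in(x):
    # Form suggested in this answer, equivalent to is_error_and.
    return len(x) not in {NB_line, NB2_line}


or_lengths = [len(r) for r in rows if is_error_or(r)]
and_lengths = [len(r) for r in rows if is_error_and(r)]
not_in_lengths = [len(r) for r in rows if is_error_not_in(r)]

print(or_lengths)      # lengths flagged by the `or` version
print(and_lengths)     # lengths flagged by the `and` version
print(not_in_lengths)  # lengths flagged by the `not in` version
```

The same predicate should work unchanged with the RDD from the question, for example rddstats.filter(is_error_not_in), since RDD.filter accepts any function that returns a boolean.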

Upvotes: 1
