Find and remove matching column values in pyspark

Question

I have a pyspark dataframe where occasionally the columns will have a wrong value that matches another column. It would look something like this:

| Date         | Latitude      |
| 2017-01-01   | 43.4553       |
| 2017-01-02   | 42.9399       |
| 2017-01-03   | 43.0091       |
| 2017-01-04   | 2017-01-04    |

Obviously, the last Latitude value is incorrect. I need to remove any and all rows that are like this. I thought about using .isin() but I can't seem to get it to work. If I try

df['Date'].isin(['Latitude'])

I get:

Column<(Date IN (Latitude))>

Any suggestions?

Find and remove matching column values in pyspark

Answers (1)

Related Questions