Jameson Welch

Reputation: 43

How do I remove duplicate rows in a Pandas DataFrame based on values in a specific column?

I have two dataframes that contain duplicates, but I need to remove only the rows that have duplicate VIN numbers, without looking at the other cells.

0  230  5UXCR6C50KTQ4xxxx  KLL34607  2019 BMW M3                        
1  116  5UXCR4C00LLW6xxxx  LLW63494  2020 BMW X5           Not Found      
2  109  5UXCR6C06LLL7xxxx  LLL76916  2020 BMW X5         Need Detail

38  229  5UXCR6C50KLL3xxxx  MLL23650  2019 BMW X5                    
43  115  5UXCR4C00LLW6xxxx  LLW63494  2020 BMW X5                    
37  108  5UXCR6C06LLL7xxxx  LLL76916  2020 BMW X5 

The last 2 rows look like different rows to pandas, but I need to merge the two dataframes and remove rows based only on those VIN numbers, ignoring the 'Not Found' and 'Need Detail' values.

I've tried .drop_duplicates(), .cumsum(), and a few other methods, but nothing seems to work.

Upvotes: 0

Views: 54

Answers (1)

DavidK

Reputation: 2564

It sounds like you need to concatenate the two dataframes and then remove duplicated rows based on only a subset of columns.

You can use pd.concat([df1, df2]).drop_duplicates(subset=['VIN'])

where subset is a list of the column names used to identify duplicated rows; all other columns are ignored. By default keep='first', so the first occurrence of each VIN is retained. (See the drop_duplicates documentation for further details.)
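A minimal sketch of the idea, with toy data modeled on the question (the column names 'VIN' and 'Note' are assumptions, since the original frames have no headers):

```python
import pandas as pd

# First frame: rows carrying status notes ('VIN' / 'Note' names are assumed).
df1 = pd.DataFrame({
    "VIN": ["5UXCR4C00LLW6xxxx", "5UXCR6C06LLL7xxxx"],
    "Note": ["Not Found", "Need Detail"],
})

# Second frame: the same VINs, but with different values in the other column.
df2 = pd.DataFrame({
    "VIN": ["5UXCR4C00LLW6xxxx", "5UXCR6C06LLL7xxxx"],
    "Note": ["", ""],
})

# Stack the frames, then keep only the first row seen for each VIN.
# Because subset=["VIN"], the differing 'Note' values are ignored when
# deciding what counts as a duplicate.
merged = pd.concat([df1, df2]).drop_duplicates(subset=["VIN"])
print(merged)  # two rows remain, one per unique VIN
```

Without subset=, drop_duplicates would compare entire rows, so the differing 'Note' values would make every row look unique, which is why plain .drop_duplicates() appeared not to work.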

Upvotes: 1
