pandas fillna on specific part of dataframe does not work as intended

Question

I'm trying to fill in missing values in a car dataset.

My dataset has the columns name, seats, mileage, and price, along with 10 other columns.

For instance, the seats column has some missing values. To fill in the NaN values, I plan to look at the corresponding name column to get the name of the car, find how many seats that car usually has (the mode), and replace the NaN values for that car with it.

Here is my code:

seat_cars = df[df['seats'].isnull()]['name'].unique()

for car in seat_cars:
    mode = df.loc[df['name'] == car, 'seats'].mode()          #returns a series
    if mode.empty == False:
        df.loc[df['name'] == car, 'seats'].fillna(mode[0], inplace = True)

But this approach doesn't seem to work: the non-null count for seats does not change when I run df.info(). For some columns, this method even seems to increase the NaN count.

What am I getting wrong here? Any help is appreciated.
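
A note on the loop above: as far as I can tell, `df.loc[df['name'] == car, 'seats']` with a boolean mask returns a new Series rather than a view of df, so `fillna(..., inplace=True)` fills that temporary object and df itself is left unchanged. A minimal sketch of one way around it, assuming a small made-up frame (the car names and seat counts below are hypothetical, not the real data), is to assign the filled values back through `.loc`:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real car dataset
df = pd.DataFrame({
    "name": ["swift", "swift", "swift", "innova", "innova", "nano"],
    "seats": [5.0, 5.0, np.nan, 7.0, np.nan, np.nan],
})

# Names of cars that have at least one missing seats value
seat_cars = df.loc[df["seats"].isnull(), "name"].unique()

for car in seat_cars:
    mask = df["name"] == car
    mode = df.loc[mask, "seats"].mode()  # mode() ignores NaN by default
    if not mode.empty:
        # Assign the result back instead of filling a temporary copy in place
        df.loc[mask, "seats"] = df.loc[mask, "seats"].fillna(mode.iloc[0])
```

In this sketch, a car whose seats values are all missing ("nano") has an empty mode and is left as NaN.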

Edit: I changed my code to this:

def fillwithmode(s):
    mode = s.mode()
    if mode.empty == False:
        s.fillna(mode[0])
    return s
    
df['seats'] = df.groupby('name')['seats'].apply(lambda x : fillwithmode(x))

But that still does not seem to fill in the missing values.
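
A note on the edited version: `s.fillna(mode[0])` returns a new Series and does not modify `s`, so `fillwithmode` hands back its input unchanged. A minimal sketch of a variant that returns the filled result, again on hypothetical sample data, and that uses `transform` (my substitution for `apply`) so the output stays aligned with the original index:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real car dataset
df = pd.DataFrame({
    "name": ["swift", "swift", "swift", "innova", "innova", "nano"],
    "seats": [5.0, 5.0, np.nan, 7.0, np.nan, np.nan],
})

def fill_with_mode(s):
    """Return s with NaN replaced by the group's most common value."""
    mode = s.mode()  # empty when every value in the group is NaN
    if mode.empty:
        return s
    return s.fillna(mode.iloc[0])  # return the new Series, not s

# transform applies the function per group and keeps the original row order
df["seats"] = df.groupby("name")["seats"].transform(fill_with_mode)
```

Groups with no known seats value at all stay NaN here and would need a separate fallback, such as the overall column mode.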
