Reputation: 395
I'm trying to fill in missing values in a car dataset. My dataset has the following columns: name, seats, mileage, price, along with 10 other columns.
For instance, the seats column has some missing values. To fill in the NaN values, I plan to look at the corresponding name column to get the name of the car, find how many seats that car usually has, and replace all the NaN values with that number.
Here is my code:
seat_cars = df[df['seats'].isnull()]['name'].unique()
for car in seat_cars:
    mode = df.loc[df['name'] == car, 'seats'].mode()  # returns a Series
    if mode.empty == False:
        df.loc[df['name'] == car, 'seats'].fillna(mode[0], inplace = True)
But this approach doesn't seem to work: the non-null count does not change when I run df.info(). In some columns, this method even seems to increase the NaN count.
What am I getting wrong here? Any help is appreciated.
Edit: I changed my code to this:
def fillwithmode(s):
    mode = s.mode()
    if mode.empty == False:
        s.fillna(mode[0])
    return s
df['seats'] = df.groupby('name')['seats'].apply(lambda x : fillwithmode(x))
but that still does not fill in the missing values.
Upvotes: 0
Views: 636
Reputation: 13821
IIUC, you want to fill the null values for each car name with that name's mode. If so, you can use groupby and fillna:
# Initial DF
print(df)
   name  seats  mileage  price
0     a    NaN       72  37095
1     a    3.0       78  20039
2     a    3.0       21  37002
3     a    NaN       79  43251
4     b    3.0       41  31115
5     b    3.0       77  30717
6     b    5.0       73  28443
7     b    NaN       20  40532
8     c    4.0       85  21792
9     c    4.0       51  26383
10    c    4.0       56  29391
11    c    NaN       77  42427
12    d    2.0       53  25393
13    d    NaN       67  22605
# Fill nulls
df.assign(
    seats = df.groupby(
        ['name']
    ).seats.apply(
        lambda x: x.fillna(x.mode()[0])
    )
)
Out[18]:
   name  seats  mileage  price
0     a    3.0       72  37095
1     a    3.0       78  20039
2     a    3.0       21  37002
3     a    3.0       79  43251
4     b    3.0       41  31115
5     b    3.0       77  30717
6     b    5.0       73  28443
7     b    3.0       20  40532
8     c    4.0       85  21792
9     c    4.0       51  26383
10    c    4.0       56  29391
11    c    4.0       77  42427
12    d    2.0       53  25393
13    d    2.0       67  22605
Don't forget to assign the result back (df = df.assign(...)), since assign returns a new DataFrame rather than modifying df in place.
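As for why the two attempts in the question have no effect: df.loc[df['name'] == car, 'seats'] with a boolean mask returns a new Series, so fillna(..., inplace=True) fills that temporary object and not df. In fillwithmode, s.fillna(mode[0]) returns a new Series that is never assigned, so the original s is returned unchanged. Also note that x.mode()[0] raises an error for a name whose seats are all NaN, and, depending on the pandas version, groupby(...).apply may prepend the group keys to the result's index, which can break alignment with df. Below is a minimal sketch that tries to avoid both issues by using transform. The sample frame and the helper name fill_with_mode are made up for illustration, so adapt them to your data:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; car "e" has no known seat count at all.
df = pd.DataFrame({
    "name":  ["a", "a", "a", "b", "b", "b", "b", "e"],
    "seats": [np.nan, 3.0, 3.0, 3.0, 3.0, 5.0, np.nan, np.nan],
})

def fill_with_mode(s):
    """Return s with NaN replaced by the group's mode, if one exists."""
    mode = s.mode()  # NaN is ignored by default; empty if the group is all NaN
    if mode.empty:
        return s  # nothing to fill with, leave the group untouched
    return s.fillna(mode.iloc[0])  # fillna returns a new Series, so return it

# transform returns a Series aligned to df's original index,
# so it can be assigned straight back to the column.
df["seats"] = df.groupby("name")["seats"].transform(fill_with_mode)
print(df)
```

Here car e keeps its NaN because there is no known seat count to borrow from. For such names you would need a different fallback, for example the overall mode of the column.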
Upvotes: 1