Replacing string values with mean of column in dataframe

Question

I load a data file into a pandas DataFrame and process it. My code works, but I'm wondering whether there is a more efficient way to do it. My code is as follows:

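import pandas as pd

# '?' marks missing values, so Horsepower is read in as strings
df = pd.read_csv("file_name.data", sep=r"\s+", names=["A","B","Horsepower"])
df1 = df[df.Horsepower != '?']  # keep only rows with a known Horsepower
df2 = df1["Horsepower"].apply(pd.to_numeric)
df = df.replace('?', df2.mean())  # replace returns a new DataFrame, so assign it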

In the data itself, several missing values in the Horsepower column are recorded as '?'. The code above replaces each '?' with the mean of the remaining numeric Horsepower values.

With that established, is there a more efficient way to replace the '?' values in "Horsepower" with the mean of the "Horsepower" column?
