Replacing null values in Pandas data frame with a series

Question

I created a function to replace missing values with knn in Python, following is my function:

def missing_variables_knn(x):
    test = data[data[x].isnull()]
    train = data[data[x].isnull()==False] 
    X_train = train.loc[:, ['ApplicantIncome', 'CoapplicantIncome', 'LoanAmount', 'Loan_Amount_Term']]
    Y_train = train[x]
    X_test = test.loc[:, ['ApplicantIncome', 'CoapplicantIncome', 'LoanAmount', 'Loan_Amount_Term']]
    knn = KNeighborsClassifier(n_neighbors=3)
    knn.fit(X_train, Y_train)
    pred = knn.predict(X_test)
    pred = pd.Series(pred)
    data[x].fillna(pred)

When I used missing_variables_knn('Gender'), I got an error:

The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Replacing null values in Pandas data frame with a series

Answers (1)

Related Questions