Neo

Reputation: 492

Impute NaN with neighbors' average for a GeoPandas dataframe

I have a GeoPandas dataframe with NaN values in one column. I want to impute each NaN with the average value of its neighbors. I put together the following example and would appreciate help with the final steps. Thanks.

# Load libraries
import numpy as np
import pandas as pd
import geopandas as gpd
from libpysal.weights.contiguity import Queen

# Make data (reset_index returns a new frame, so the assignment below
# does not write into a slice of `world`)
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
africa = world[world['continent'] == 'Africa'].reset_index(drop=True)
africa.loc[[2, 8, 15, 22, 30, 35, 39, 43], 'pop_est'] = np.nan  # Set pop_est to NaN for selected rows
africa

# Generate Queen contiguity weights
w = Queen.from_dataframe(africa)
w.neighbors[2]  # Check neighbors of index 2

For example, index 2 has a missing population estimate, and its neighbors are [0, 35, 36, 48, 49, 50, 27, 28, 31]. I want to replace the NaN with the mean population estimate of those neighbors. Thanks.

Upvotes: 0

Views: 201

Answers (1)

Neo

Reputation: 492

I finally figured out how to do it.
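
A minimal sketch of one way to do this, not necessarily the approach used here. It assumes the neighbor lists are available as a dict like `w.neighbors`, and that neighbors which are themselves NaN should be skipped when averaging. The function name `impute_with_neighbor_mean` and the toy data are hypothetical, made up for illustration.

```python
import math


def impute_with_neighbor_mean(values, neighbors):
    """Return {index: value}, replacing each NaN with the mean of its
    non-missing neighbors.

    values    -- mapping of index -> number (a dict or a pandas Series)
    neighbors -- mapping of index -> list of neighbor indices,
                 e.g. the `w.neighbors` dict of a libpysal weights object
    """
    filled = {}
    for idx, val in values.items():
        if not math.isnan(val):
            filled[idx] = val
            continue
        # Use only the original, non-missing neighbor values.
        known = [values[j] for j in neighbors.get(idx, [])
                 if not math.isnan(values[j])]
        # Stay NaN if no neighbor has a usable value.
        filled[idx] = sum(known) / len(known) if known else math.nan
    return filled


# Toy data standing in for africa['pop_est'] and w.neighbors (hypothetical)
toy_values = {0: 10.0, 1: math.nan, 2: 30.0, 3: math.nan}
toy_neighbors = {0: [1], 1: [0, 2, 3], 2: [1, 3], 3: [1, 2]}

filled = impute_with_neighbor_mean(toy_values, toy_neighbors)
print(filled)
```

With the dataframe from the question, the result could presumably be written back with something like `africa['pop_est'] = pd.Series(impute_with_neighbor_mean(africa['pop_est'], w.neighbors))`. Averaging only the original values, rather than values already imputed in the same pass, keeps the result independent of row order.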

Upvotes: 0
