I have a GeoPandas GeoDataFrame in which one column contains NaN values. I want to impute each NaN with the mean value of that row's spatial neighbors. I put together the following example and would appreciate help with the final steps. Thanks.
# Load libraries
import numpy as np
import pandas as pd
import geopandas as gpd
from libpysal.weights.contiguity import Queen
# Make data
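# Note: gpd.datasets was removed in geopandas 1.0, so this line needs geopandas < 1.0
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
# Take the African countries and reset the index so labels run 0..n-1
africa = world[world['continent'] == 'Africa'].reset_index(drop=True)
africa.loc[[2, 8, 15, 22, 30, 35, 39, 43], 'pop_est'] = np.nan  # Set pop_est to NaN for selected rows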
africa
# Generate Queen contiguity weights
w = Queen.from_dataframe(africa)
w.neighbors[2] # Check neighbors of index 2
For example, index 2 has a missing population estimate, and its neighbors are [0, 35, 36, 48, 49, 50, 27, 28, 31]. I want to replace the NaN with the mean population estimate of [0, 35, 36, 48, 49, 50, 27, 28, 31]. Thanks.
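To make the desired result concrete, here is a minimal sketch of the neighbor-mean rule on toy data. It uses plain dictionaries rather than the real GeoDataFrame; the function name `impute_from_neighbors` and the toy values are hypothetical, and the `neighbors` argument is assumed to have the same shape as `w.neighbors` (a dict mapping each index to a list of neighbor indices). Neighbors that are themselves NaN are skipped, and a value with no known neighbors is left as NaN.

```python
import math


def impute_from_neighbors(values, neighbors):
    """Return a copy of `values` where each NaN is replaced by the mean
    of its non-NaN neighbors (hypothetical helper, not a library function)."""
    filled = dict(values)
    for idx, val in values.items():
        if not math.isnan(val):
            continue  # only missing entries are imputed
        # Read from the original values so the result does not depend on order
        known = [values[n] for n in neighbors.get(idx, [])
                 if not math.isnan(values[n])]
        if known:
            filled[idx] = sum(known) / len(known)
    return filled


# Toy data: index 1 and index 3 are missing
values = {0: 10.0, 1: float("nan"), 2: 30.0, 3: float("nan")}
neighbors = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}

result = impute_from_neighbors(values, neighbors)
print(result)  # index 1 becomes 20.0; index 3 stays NaN (its only neighbor is NaN)
```

With the objects from the example above, this could presumably be applied as `africa['pop_est'] = pd.Series(impute_from_neighbors(africa['pop_est'].to_dict(), w.neighbors))`, assuming the keys of `w.neighbors` match the 0..n-1 index of `africa`.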