Balthasar

Reputation: 323

Adaptive Bandwidth Kernel Density Estimation

There seems to be a wealth of information and tools available for the implementation of standard multivariate or univariate kernel density estimation. However, the discrete geographic data I am currently working with is especially sparse and tends to cluster around areas of high population density.

That is to say, I have a number of points (longitude and latitude) on a map, and I would like to estimate a probability density from them, but I need to somehow normalize for population density. From looking around, it seems the proper method for this type of problem is some sort of nearest-neighbor adaptive bandwidth for the kernel estimation. However, scipy.stats.gaussian_kde does not appear to support adaptive bandwidths. Is anyone aware of how I might implement this myself, or of any packages available for adaptive bandwidth KDEs?

Upvotes: 9

Views: 5435

Answers (1)

Gabriel

Reputation: 42459

I came across this question searching for variable/adaptive kernel density estimation packages in Python. I realize the OP has probably long moved on, but here's what I've found anyway:

  • AdaptiveKDE (Python module for adaptive kernel density estimation)

    This package implements adaptive kernel density estimation algorithms for 1-dimensional signals developed by Hideaki Shimazaki. This enables the generation of smoothed histograms that preserve important density features at multiple scales, as opposed to naive single-bandwidth kernel density methods that can either over- or under-smooth density estimates.

  • awkde (Adaptive Width KDE with Gaussian Kernels)

    The kernel bandwidth is chosen locally to account for variations in the density of the data. Areas with high density get smaller kernels and vice versa. This smooths the tails and gives high resolution in high-statistics regions.

    This uses the awesome pybind11 package which makes creating C++ bindings super convenient. Only the evaluation is written in a small C++ snippet to speed it up, the rest is a pure python implementation.

The next package does not have an adaptive method, but it includes an algorithm that is well suited for multimodal distributions.

  • KDEpy (Kernel Density Estimation in Python)

    This Python 3.5+ package implements various kernel density estimators (KDE). Three algorithms are implemented through the same API: NaiveKDE, TreeKDE and FFTKDE. The class FFTKDE outperforms other popular implementations, see the comparison page.
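If none of these packages fit, the nearest-neighbor adaptive bandwidth mentioned in the question can be sketched directly in NumPy. This is only a minimal sketch and is not taken from any of the packages above: the function name `knn_adaptive_kde` and the parameter `k` are hypothetical, each data point gets an isotropic Gaussian kernel whose bandwidth is the distance to its k-th nearest neighbor, and longitude/latitude are treated as plain Euclidean coordinates (an assumption that is only reasonable over small areas).

```python
import numpy as np


def knn_adaptive_kde(points, query, k=5):
    """Sample-point adaptive Gaussian KDE (illustrative sketch).

    points : (n, d) array of observations, requires n > k
    query  : (m, d) array of locations where the density is evaluated
    k      : neighbor rank used to set each point's bandwidth (hypothetical name)
    """
    points = np.asarray(points, dtype=float)
    query = np.asarray(query, dtype=float)
    n, d = points.shape

    # Pairwise Euclidean distances between all observations.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # After sorting each row, column 0 is the point itself (distance 0),
    # so column k is the distance to the k-th nearest neighbor.
    h = np.sort(dist, axis=1)[:, k]
    h = np.maximum(h, 1e-12)  # guard against duplicate points

    # Squared distances from every query location to every observation.
    d2 = ((query[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)

    # Isotropic Gaussian kernel per observation, each normalized to integrate to 1.
    norm = (2.0 * np.pi * h ** 2) ** (d / 2.0)
    kernels = np.exp(-0.5 * d2 / h ** 2) / norm

    # Average the n kernels at each query location.
    return kernels.mean(axis=1)
```

Points in dense clusters get narrow kernels and isolated points get wide ones, which is the behavior described for awkde above. Memory use grows with the number of query locations times the number of observations, so large grids would need to be evaluated in chunks.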

Upvotes: 11
