Philipp

Reputation: 4799

Find mode of kernel density estimate in GPS data

I'm analyzing GPS location data with weights indicating "importance". This can easily be plotted as a heatmap, e.g. in Google Maps. I would like to analyze it using the Python data stack and, in particular, want to find the mode of a kernel density estimate (KDE).

How can I compute the mode of a KDE in Python?

Very specifically, given the example at https://scikit-learn.org/stable/auto_examples/neighbors/plot_species_kde.html how would you find the location with highest probability of finding the "Bradypus variegatus" species?

Upvotes: 1

Views: 1163

Answers (1)

bubble

Reputation: 1672

Let's consider a simple example of computing a kernel density estimate:

import numpy as np
from scipy.stats import gaussian_kde
import matplotlib.pyplot as plt

np.random.seed(10)

x = np.random.rand(100)
y = np.random.rand(100)
kde = gaussian_kde(np.vstack([x, y]))
X, Y = np.meshgrid(np.linspace(0, 1, 100), np.linspace(0, 1, 100))
Z = kde(np.vstack([X.ravel(), Y.ravel()])).reshape(X.shape)

plt.contourf(X, Y, Z)
plt.show()


Now we can get the X and Y coordinates of the grid point where Z takes its maximum value:

X.ravel()[np.argmax(Z.ravel())]

0.3535353535353536

Y.ravel()[np.argmax(Z.ravel())]

0.5555555555555556
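
The grid maximum is only accurate to the grid spacing. As a rough sketch (not part of the original answer), one way to refine it is to use the best grid point as the starting guess for a local optimizer that minimizes the negative density. The choice of `scipy.optimize.minimize` with Nelder-Mead is an assumption made here for illustration; any local optimizer would do, and it only finds the local maximum nearest the starting point.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import gaussian_kde

np.random.seed(10)
x = np.random.rand(100)
y = np.random.rand(100)
kde = gaussian_kde(np.vstack([x, y]))

# Coarse search: evaluate the KDE on a grid and pick the best grid point
X, Y = np.meshgrid(np.linspace(0, 1, 100), np.linspace(0, 1, 100))
Z = kde(np.vstack([X.ravel(), Y.ravel()])).reshape(X.shape)
i, j = np.unravel_index(np.argmax(Z), Z.shape)
start = np.array([X[i, j], Y[i, j]])

# Local refinement: minimize the negative density, starting from the grid maximum
result = minimize(lambda p: -kde(p)[0], start, method="Nelder-Mead")
mode = result.x  # refined (x, y) estimate of the mode
```

Because the optimizer starts at the grid maximum, the density at `mode` should be at least as high as anywhere on the grid.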

In practice, when estimating where a species is most likely to occur, you usually want not just a single position but some area around it. In this case you can select, for example, all grid locations where the density is greater than the 90th percentile of the density values on the grid, e.g.

Y.ravel()[Z.ravel() > np.percentile(Z, 90)]
X.ravel()[Z.ravel() > np.percentile(Z, 90)]

For the cited example, you can try the same approach to get the desired result. You will probably need to tweak the threshold, e.g. use the 75th percentile instead of the 90th.
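
Since the question mentions importance weights, here is a hedged sketch of the same grid search with a weighted KDE. It assumes a SciPy version in which `gaussian_kde` accepts a `weights` argument (SciPy 1.2 or later, as far as I know). The arrays `lon`, `lat` and `importance` are made-up placeholder data, and the coordinates are treated as planar, which is only an approximation for GPS data over large areas.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical data: 200 GPS-like points with an importance weight each
rng = np.random.default_rng(0)
lon = rng.uniform(0, 1, 200)
lat = rng.uniform(0, 1, 200)
importance = rng.uniform(0.1, 1.0, 200)

# Weighted KDE: points with larger weights contribute more to the density
kde_w = gaussian_kde(np.vstack([lon, lat]), weights=importance)

# Evaluate on a grid and take the grid point with the highest density
grid_lon, grid_lat = np.meshgrid(np.linspace(0, 1, 100), np.linspace(0, 1, 100))
density = kde_w(np.vstack([grid_lon.ravel(), grid_lat.ravel()])).reshape(grid_lon.shape)
i, j = np.unravel_index(np.argmax(density), density.shape)
mode_lon, mode_lat = grid_lon[i, j], grid_lat[i, j]
```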

Upvotes: 2
