Enzo Ferrari

Reputation: 23

What does the points parameter do to violin plots in matplotlib?

What happens when the points parameter of matplotlib's violinplot is changed, and when would it be useful to change it?

The points parameter of violinplot is documented as follows.

points : scalar, default = 100

Defines the number of points to evaluate each of the gaussian kernel density estimations at.

Axes.violinplot(self, dataset, positions=None, vert=True, widths=0.5, showmeans=False, showextrema=True, showmedians=False, points=100, bw_method=None, *, data=None)

I see very little change on my graphs when I change it. Why is that the case?

Upvotes: 2

Views: 710

Answers (1)

Sheldore

Reputation: 39052

As per the [official docs](https://matplotlib.org/3.1.0/api/_as_gen/matplotlib.axes.Axes.violinplot.html) (second emphasis mine):

points : scalar, default = 100 Defines the number of points to evaluate each of the gaussian kernel density estimations at.

So, as the following example (adapted from here) demonstrates, the effect of points is most visible when you choose a very small number of points. The outcome also depends on the sample size: try a smaller sample such as size=5 and rerun the code below. As you increase points, the outline of the density estimate gets smoother. Beyond some number of points, which you can find with a convergence test, further increases make no noticeable difference. That is why changing points around the default of 100 barely alters your graphs.

import numpy as np
import matplotlib.pyplot as plt

# Fixing random state for reproducibility
np.random.seed(19680801)


fs = 10  # fontsize
pos = [1, 2, 4, 5, 7, 8]
data = [np.random.normal(0, std, size=100) for std in pos]

fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(10, 3))

# Use the same bandwidth rule in every panel so that only points differs
axes[0].violinplot(data, pos, points=2, widths=1,
                      showmeans=True, showextrema=True, showmedians=True,
                      bw_method='silverman')
axes[0].set_title('Custom violinplot 1', fontsize=fs)

axes[1].violinplot(data, pos, points=5, widths=1,
                      showmeans=True, showextrema=True, showmedians=True,
                      bw_method='silverman')
axes[1].set_title('Custom violinplot 2', fontsize=fs)

axes[2].violinplot(data, pos, points=100, widths=1,
                      showmeans=True, showextrema=True, showmedians=True,
                      bw_method='silverman')
axes[2].set_title('Custom violinplot 3', fontsize=fs)

for ax in axes.flat:
    ax.set_yticklabels([])

plt.tight_layout()
plt.show()

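To put a number on the convergence mentioned above, here is a minimal sketch using NumPy only. It assumes the violin outline is the density evaluated on an evenly spaced grid between the sample minimum and maximum, with straight segments in between. The helper names (gaussian_kde_1d, violin_outline, outline_error) are made up for this illustration and are not matplotlib functions, and the hand-written KDE with Scott's rule stands in for whatever matplotlib does internally.

```python
import numpy as np


def gaussian_kde_1d(sample, coords):
    """Evaluate a 1-D Gaussian KDE of `sample` at `coords` (Scott's rule bandwidth)."""
    sample = np.asarray(sample, dtype=float)
    coords = np.asarray(coords, dtype=float)
    n = sample.size
    h = sample.std(ddof=1) * n ** (-1.0 / 5.0)  # Scott's rule in one dimension
    z = (coords[:, None] - sample[None, :]) / h
    return np.exp(-0.5 * z ** 2).sum(axis=1) / (n * h * np.sqrt(2.0 * np.pi))


def violin_outline(sample, points):
    """Grid and density values that an outline with `points` evaluations would use."""
    coords = np.linspace(sample.min(), sample.max(), points)
    return coords, gaussian_kde_1d(sample, coords)


def outline_error(sample, points, reference_points=1000):
    """Largest gap between the coarse piecewise-linear outline and a fine one."""
    fine, fine_density = violin_outline(sample, reference_points)
    coords, density = violin_outline(sample, points)
    coarse_on_fine = np.interp(fine, coords, density)
    return float(np.max(np.abs(coarse_on_fine - fine_density)))


rng = np.random.default_rng(19680801)
sample = rng.normal(0.0, 1.0, size=100)

errors = {p: outline_error(sample, p) for p in (2, 5, 10, 50, 100)}
for p, err in errors.items():
    print(f"points={p:>3}: max deviation from the fine curve = {err:.5f}")
```

The deviation should shrink quickly as points grows. Once it is far below what a line on screen can resolve, raising points further changes nothing visible.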

P.S.: To highlight the point further, consider just a single position and three cases: 5, 10 and 50 points, as suggested by @ImportanceOfBeingEarnest.

pos = [1]
data = [np.random.normal(0, std, size=100) for std in pos]

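The two lines above only swap the data. For completeness, here is one way the full single-violin comparison could look. This is a reconstruction and not the exact script behind the figure: the function name plot_points_comparison is made up for this sketch, and it uses np.random.default_rng instead of np.random.seed, so the sample will differ from the original run.

```python
import numpy as np
import matplotlib.pyplot as plt


def plot_points_comparison(points_values=(5, 10, 50), seed=19680801):
    """Draw the same sample as one violin per value of `points`."""
    rng = np.random.default_rng(seed)
    data = [rng.normal(0, 1, size=100)]  # a single dataset, hence a single violin
    fig, axes = plt.subplots(nrows=1, ncols=len(points_values),
                             figsize=(10, 3), sharey=True)
    for ax, n_points in zip(np.atleast_1d(axes), points_values):
        ax.violinplot(data, [1], points=n_points, widths=1,
                      showmeans=True, showextrema=True, showmedians=True,
                      bw_method='silverman')
        ax.set_title(f'points={n_points}', fontsize=10)
    fig.tight_layout()
    return fig


if __name__ == "__main__":
    plot_points_comparison()
    plt.show()
```

With 5 points the outline is visibly angular, with 10 it is already close to smooth, and with 50 it should be hard to tell apart from the default of 100.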

Upvotes: 3
