Glaslos

Reputation: 2923

frequency trail in matplotlib

I'm looking into outlier detection. Brendan Gregg has a really nice article on it, and I'm especially intrigued by his visualizations. One of the methods he uses is frequency trails.

frequency trails

I'm trying to reproduce this in matplotlib using this example, which looks like this:

polys3d_demo

And the plot is based on this answer: https://stackoverflow.com/a/4152016/948369

Now my issue is, as described by Brendan, that I have a continuous line that masks the outliers (I simplified the input values so you can still see them):

masked outlier

Any help on making the line "non-continuous" where there are no values?

Upvotes: 5

Views: 4318

Answers (2)

Lucas van Dijk

Reputation: 849

Seaborn also provides a very neat example:

Seaborn KDE joyplot

They call it a joy plot or ridge plot, however: https://seaborn.pydata.org/examples/kde_ridgeplot.html

#!/usr/bin/python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="white", rc={"axes.facecolor": (0, 0, 0, 0)})

# Create the data
rs = np.random.RandomState(1979)
x = rs.randn(500)
g = np.tile(list("ABCDEFGHIJ"), 50)
df = pd.DataFrame(dict(x=x, g=g))
m = df.g.map(ord)
df["x"] += m

# Initialize the FacetGrid object
pal = sns.cubehelix_palette(10, rot=-.25, light=.7)
# Note: FacetGrid's `size` parameter was renamed to `height` in seaborn 0.9
g = sns.FacetGrid(df, row="g", hue="g", aspect=15, height=.5, palette=pal)

# Draw the densities in a few steps
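# Note: newer seaborn versions use `fill` instead of `shade` and `bw_method` instead of `bw`
g.map(sns.kdeplot, "x", clip_on=False, fill=True, alpha=1, lw=1.5, bw_method=.2)
g.map(sns.kdeplot, "x", clip_on=False, color="w", lw=2, bw_method=.2)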
g.map(plt.axhline, y=0, lw=2, clip_on=False)

# Define and use a simple function to label the plot in axes coordinates
def label(x, color, label):
    ax = plt.gca()
    ax.text(0, .2, label, fontweight="bold", color=color, 
            ha="left", va="center", transform=ax.transAxes)

g.map(label, "x")

# Set the subplots to overlap
g.fig.subplots_adjust(hspace=-.25)

# Remove axes details that don't play well with overlap
g.set_titles("")
g.set(yticks=[])
g.despine(bottom=True, left=True)

Upvotes: 6

Hooked

Reputation: 88198

I would stick with a flat 2D plot and displace each level by a set vertical amount. You'll have to play with the spacing between levels (in the code below I called it displace) to properly see the outliers, but this does a pretty good job of replicating your target image. The key, I think, is to set the "zero" values to None so pylab does not draw them.


import numpy as np
import matplotlib.pyplot as plt
import itertools

k = 20
X = np.linspace(0, 20, 500)
Y = np.zeros((k,X.size))

# Add some fake data
MU = np.random.random(k)
for n in range(k):
    Y[n] += np.exp(-(X-MU[n]*n)**2 / (1+n/3))
Y *= 50

# Add some outliers for show
Y += 2*np.random.random(Y.shape)

displace = Y.max()/4

# Add a cutoff: None is stored as NaN in a float array,
# and NaN values are skipped when drawing
Y[Y<1.0] = None

face_colors = itertools.cycle(["#D3D820", "#C9CC54", 
                               "#D7DA66", "#FDFE42"])

fig = plt.figure()
ax = fig.add_subplot(111, facecolor='black')  # axisbg was removed in matplotlib 2.2
ax.xaxis.set_visible(False)
ax.yaxis.set_visible(False)

for n,y in enumerate(Y):
    # Vertically displace each plot
    y0 = np.ones(y.shape) * n * displace
    y1 = y + n*displace

    plt.fill_between(X, y0, y1, lw=1,
                     facecolor=next(face_colors),
                     zorder=len(Y)-n)
plt.show()
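
To isolate the masking step, here is a minimal sketch of the same idea on a single line. It is an illustration rather than part of the code above: the helper names mask_below and count_segments and the sample values are made up for the example. It relies only on matplotlib leaving a gap in a line wherever the data is NaN.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt


def mask_below(y, cutoff):
    """Return a float copy of y with values below cutoff replaced by NaN."""
    out = np.asarray(y, dtype=float).copy()
    out[out < cutoff] = np.nan
    return out


def count_segments(y):
    """Count contiguous runs of finite values, i.e. separately drawn pieces."""
    finite = np.isfinite(y)
    # A run starts wherever a finite value follows a non-finite one (or the start).
    starts = finite & ~np.concatenate(([False], finite[:-1]))
    return int(starts.sum())


# Hypothetical sample data: a small bump followed by a lone outlier.
x = np.arange(10)
y = np.array([0, 5, 6, 0, 0, 0, 0, 3, 0, 0], dtype=float)
masked = mask_below(y, cutoff=1.0)

fig, ax = plt.subplots()
ax.plot(x, y, color="0.7", label="continuous")   # line runs through the zeros
ax.plot(x, masked, marker="o", label="masked")   # line breaks at the NaN gaps
ax.legend()
# Call fig.savefig(...) or plt.show() here to look at the result.
```

One thing to watch for: an isolated value with NaN on both sides has no neighbour to connect to, so a plain line will not show it at all. The marker in the second plot call is there to keep such single-point outliers visible.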

Upvotes: 4
