Reputation: 1079
I'd like to plot a time series and show outliers in a different color than the rest of the points. How can I use seaborn's lineplot to show a different color for specific points?
When I use `sns.lineplot(data=df, x='datetime', y='y', hue='outlier_label')`, I get a separate series for each category (outlier or inlier). I would like to plot a single series instead.
import numpy as np
import pandas as pd
import seaborn as sns
x = range(100)
errors = np.random.normal(size=len(x))
y = np.sin(x) + errors
outlier_indices = [10, 20, 30, 40, 50]
# Shift the chosen points so they stand out as outliers
for oi in outlier_indices:
    y[oi] = y[oi] + 10 * errors[50]
data = pd.DataFrame({"x": x, "y": y})
data["outlier_label"] = np.where(data.index.isin(outlier_indices), 1, 0)
# hue splits the data into two separate lines, which is the unwanted result
sns.lineplot(data=data, x="x", y="y", hue="outlier_label")
Upvotes: 1
Views: 1240
Reputation: 80459
You could first plot the curve without taking the outlier status into account, and then add a scatter plot with the positions of the outliers.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
x = np.arange(100)
errors = np.random.normal(size=len(x))
y = np.sin(x) + errors
outlier_indices = [10, 20, 30, 40, 50]
for oi in outlier_indices:
    y[oi] = y[oi] + 10 * errors[50]
data = pd.DataFrame({"x": x, "y": y})
data["outlier_label"] = np.where(data.index.isin(outlier_indices), 1, 0)
# Draw the full curve as one series (no hue, so the line is not split)
ax = sns.lineplot(data=data, x="x", y="y", label="given curve")
# Overlay only the outlier rows as markers on the same axes
sns.scatterplot(data=data[data["outlier_label"] == 1], x="x", y="y", color="crimson", label="outlier", ax=ax)
plt.tight_layout()
plt.show()
Upvotes: 3