Reputation: 1019
I am trying out Seaborn to make my plots look better than plain matplotlib. I have a dataset with a column 'Year' that I want to plot on the X-axis, and four columns, say A, B, C and D, that I want to plot on the Y-axis as differently coloured lines. I was trying to do this with the sns.lineplot method, but it allows only one variable on the X-axis and one on the Y-axis. I tried doing this:
sns.lineplot(data_preproc['Year'],data_preproc['A'], err_style=None)
sns.lineplot(data_preproc['Year'],data_preproc['B'], err_style=None)
sns.lineplot(data_preproc['Year'],data_preproc['C'], err_style=None)
sns.lineplot(data_preproc['Year'],data_preproc['D'], err_style=None)
But this way I don't get a legend in the plot to show which coloured line corresponds to what. I tried checking the documentation but couldn't find a proper way to do this.
Upvotes: 71
Views: 214546
Reputation: 23
Realizing this is old, and the "better" approach is with long data, but it's also easy to get a legend with labels by modifying the original post's attempted approach.
import seaborn as sns
import matplotlib.pyplot as plt
# x and y are passed by keyword: seaborn 0.12 and later no longer accept them positionally
sns.lineplot(x=data_preproc['Year'], y=data_preproc['A'], label='A')
sns.lineplot(x=data_preproc['Year'], y=data_preproc['B'], label='B')
sns.lineplot(x=data_preproc['Year'], y=data_preproc['C'], label='C')
sns.lineplot(x=data_preproc['Year'], y=data_preproc['D'], label='D')
plt.legend()
plt.show()
Upvotes: 0
Reputation: 3105
Seaborn favors the "long format" as input. The key ingredient to convert your DataFrame from its "wide format" (one column per measurement type) into long format (one column for all measurement values, one column to indicate the type) is pandas.melt. Given a data_preproc structured like yours, filled with random values:
import numpy as np
import pandas as pd
import seaborn as sns

num_rows = 20
years = list(range(1990, 1990 + num_rows))
data_preproc = pd.DataFrame({
    'Year': years,
    'A': np.random.randn(num_rows).cumsum(),
    'B': np.random.randn(num_rows).cumsum(),
    'C': np.random.randn(num_rows).cumsum(),
    'D': np.random.randn(num_rows).cumsum()})
# Convert the dataframe from wide to long format
dfl = pd.melt(data_preproc, ['Year'])
A single plot with four lines, one per measurement type, is obtained with
sns.lineplot(data=dfl, x='Year', y='value', hue='variable')
(Note that 'value' and 'variable' are the default column names returned by melt, and can be adapted to your liking.)
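To illustrate adapting those names, here is a minimal sketch using the var_name and value_name parameters of pd.melt. The names 'series' and 'measurement' are arbitrary choices for this example, and the small DataFrame is made-up data rather than the asker's.

```python
import pandas as pd

# Made-up wide-format data: one row per year, one column per series.
data_preproc = pd.DataFrame({
    "Year": [1990, 1991, 1992],
    "A": [1.0, 1.5, 0.7],
    "B": [0.2, 0.9, 1.1],
})

# 'Year' is kept as an identifier column; every other column is stacked
# into rows. 'series' and 'measurement' are illustrative names.
dfl = pd.melt(
    data_preproc,
    id_vars=["Year"],
    var_name="series",
    value_name="measurement",
)

print(dfl)
```

With these names the plot call would read sns.lineplot(data=dfl, x='Year', y='measurement', hue='series').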
Upvotes: 109
Reputation: 11
You can set 'Year' as the index using data_preproc.set_index("Year", inplace=True). After that, sns.lineplot(data=data_preproc) works directly: seaborn treats the DataFrame as wide-form data and draws one line per remaining column against the index.
Upvotes: 1
Reputation: 2599
See the documentation:
sns.lineplot(x="Year", y="signal", hue="label", data=data_preproc)
You probably need to re-organize your dataframe in a suitable way so that there is one column for the x data, one for the y data, and one which holds the label for the data point.
You can also just use matplotlib.pyplot. Seaborn's improved design carries over to "regular" matplotlib plots as well: older seaborn versions applied it on import, while since version 0.8 you need to call sns.set() (or sns.set_theme() in 0.11 and later). Seaborn is really "just" a collection of methods which conveniently feed data and plot parameters to matplotlib.
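As a sketch of that matplotlib-only route, the example below applies the seaborn theme and then plots with plain matplotlib calls. It assumes seaborn 0.11 or later for sns.set_theme(), and the DataFrame is again made-up random data.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Apply seaborn's default styling to all subsequent matplotlib figures.
sns.set_theme()

# Made-up stand-in for data_preproc.
rng = np.random.default_rng(0)
num_rows = 20
data_preproc = pd.DataFrame({
    "Year": range(1990, 1990 + num_rows),
    "A": rng.standard_normal(num_rows).cumsum(),
    "B": rng.standard_normal(num_rows).cumsum(),
    "C": rng.standard_normal(num_rows).cumsum(),
    "D": rng.standard_normal(num_rows).cumsum(),
})

fig, ax = plt.subplots()

# Plain matplotlib: one ax.plot call per column, labelled for the legend.
for column in ["A", "B", "C", "D"]:
    ax.plot(data_preproc["Year"], data_preproc[column], label=column)

ax.set_xlabel("Year")
ax.set_ylabel("value")
legend = ax.legend()
```

Call plt.show() to display the figure.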
Upvotes: 19