Brewgrammer

Reputation: 13

Python ViolinPlots

I'm currently trying to make a violin plot with this data set.

I want the x-axis to be the first column (Time), which is in total seconds, with the left half of each violin showing AskBidAvg and the right half showing GrpAvg. Is there a method other than hue, in either seaborn or matplotlib? From the examples I saw, hue only takes a column that has 2 unique values, but our columns hold many different values, which is causing the problem. We are using 1-minute increments, which I calculate from total_seconds(). The current code I'm using is:

sns.violinplot(x="Time", hue=["AskBidAvg", "GrpAvg"], inner="quartiles", linewidth=1, split=True, data=df)

However, it throws an error that hue cannot be more than 2 values.

Upvotes: 0

Views: 357

Answers (1)

James Natale

Reputation: 496

To plot what you want, you need to reshape your data from wide to long format.

The Time column is fine as it is. A second column should hold the y values (all of the numeric values), and a third column should say whether each value is an AskBidAvg or a GrpAvg:

    Time    variable    value
0   18000   AskBidAvg   -0.000019
1   18000   AskBidAvg   -0.000024
2   18000   AskBidAvg   0.000019
...     ...     ...     ...
76  18004   GrpAvg  -0.000019
77  18005   GrpAvg  -0.000005
78  18005   GrpAvg  -0.000012
79  18005   GrpAvg  0.000002

Pandas has a function, pd.melt, that does this reshaping for you.

import pandas as pd
import seaborn as sns

# Load the wide-format data (one column each for Time, AskBidAvg, GrpAvg)
df = pd.read_csv("/Users/james.natale/Downloads/yourdata.csv", index_col=False, header=0)
# Reshape to long format: columns Time, variable, value
df = pd.melt(df, id_vars=['Time'], value_vars=['AskBidAvg', 'GrpAvg'])

sns.set(style="whitegrid", palette="pastel", color_codes=True)

# Draw a nested violinplot and split the violins for easier comparison
sns.violinplot(data=df, x="Time", y="value", hue="variable", split=True,
               inner="quart")
sns.despine(left=True)
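
Since the question mentions 1-minute increments, here is a rough sketch of how the same reshape could be combined with minute binning, so that each violin covers one minute rather than one second. The data is synthetic stand-in data, and the names wide, long_df, Minute, and plot_split_violins are hypothetical, not taken from the original post. It assumes Time holds whole seconds.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: one row per second for three minutes,
# starting at 18000 s. Replace this with your own DataFrame.
rng = np.random.default_rng(0)
n_seconds = 180
wide = pd.DataFrame({
    "Time": np.arange(18000, 18000 + n_seconds),
    "AskBidAvg": rng.normal(0, 2e-5, n_seconds),
    "GrpAvg": rng.normal(0, 1e-5, n_seconds),
})

# Wide -> long: one row per (Time, variable) pair, values in "value"
long_df = wide.melt(id_vars="Time", value_vars=["AskBidAvg", "GrpAvg"])

# Assumption: bin by whole minutes using integer division of the seconds
long_df["Minute"] = long_df["Time"] // 60


def plot_split_violins(data):
    """Draw one split violin per minute: AskBidAvg on one side, GrpAvg on the other."""
    import seaborn as sns  # imported here so the reshaping above runs without seaborn

    ax = sns.violinplot(data=data, x="Minute", y="value", hue="variable",
                        split=True, inner="quart")
    return ax


# Usage: ax = plot_split_violins(long_df)
```

Because hue now refers to a single column with exactly two levels, split=True has one distribution for each half of the violin.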

Upvotes: 2
