Reputation: 13
I'm currently trying to make a violin plot with this data set.
I want the x axis to be the first column (Time), which is in total seconds, with the left half of each violin showing AskBidAvg and the right half showing GrpAvg. Is there a way to do this, in either seaborn or matplotlib, without using hue? From the examples I have seen, hue only takes a single column that has 2 unique values, whereas I have two separate columns that each hold many different values, which is causing the problem. We are using 1-minute increments, which I calculate from the total seconds. The code I'm currently using is:
sns.violinplot(x="Time", hue=["AskBidAvg", "GrpAvg"], inner="quartiles", linewidth=1, split=True, data=df)
However, it throws an error that hue cannot be more than 2 values.
Upvotes: 0
Views: 357
Reputation: 496
To plot what you want, you first need to reshape your data.
You have the Time column; that one is fine. The second column should hold the y values (all of the numeric values). A third column should then tell you whether each value comes from AskBidAvg or GrpAvg:
Time variable value
0 18000 AskBidAvg -0.000019
1 18000 AskBidAvg -0.000024
2 18000 AskBidAvg 0.000019
... ... ... ...
76 18004 GrpAvg -0.000019
77 18005 GrpAvg -0.000005
78 18005 GrpAvg -0.000012
79 18005 GrpAvg 0.000002
pandas has a function, pd.melt, that does this reshaping for you.
import pandas as pd
df = pd.read_csv("/Users/james.natale/Downloads/yourdata.csv",index_col=False,header=0)
df = pd.melt(df, id_vars=['Time'], value_vars=['AskBidAvg', 'GrpAvg'])
import seaborn as sns
sns.set(style="whitegrid", palette="pastel", color_codes=True)
# Draw a nested violinplot and split the violins for easier comparison
sns.violinplot(x=df['Time'], y=df['value'], hue=df['variable'], split=True,
               inner="quart")
sns.despine(left=True)
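Since the question mentions 1-minute increments, here is a minimal sketch of binning the seconds into minutes before melting, so that each violin covers one minute instead of one second. The toy values and the names raw, Minute and long_df are made up for illustration; only Time, AskBidAvg and GrpAvg come from the question, and integer division by 60 is my assumption about how the minutes are derived.

```python
import pandas as pd

# Hypothetical toy data standing in for the real CSV (values are made up).
raw = pd.DataFrame({
    "Time": [18000, 18030, 18059, 18060, 18119, 18120],
    "AskBidAvg": [-1.9e-05, -2.4e-05, 1.9e-05, 0.5e-05, -0.7e-05, 1.1e-05],
    "GrpAvg": [-1.9e-05, -0.5e-05, -1.2e-05, 0.2e-05, 0.9e-05, -0.3e-05],
})

# Assumption: a "1 minute increment" is the whole minute the second falls in.
raw["Minute"] = raw["Time"] // 60

# Reshape to long format: one row per (Minute, variable, value).
long_df = raw.melt(id_vars=["Minute"], value_vars=["AskBidAvg", "GrpAvg"])

print(long_df.head())
```

long_df can then be plotted the same way as above, with x="Minute", y="value", hue="variable" and data=long_df. Because variable has exactly two levels, split=True should work.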
Upvotes: 2