I have a dataframe with IDs and numerous test results relating to each ID. I want to create a second dataframe that summarises the mean and the standard deviation of one particular test for each ID, which I can then plot on a graph.
Below is the code I have so far. It raises "ValueError: Length mismatch: Expected axis has 1 elements, new values have 2 elements".
Can anyone help?
df2 = df1.groupby(['id'], as_index=True).agg({'variable_1':['mean'], 'variable_1':['std']})
df2.columns=['var_mean','var_std']
df2.plot(x='var_mean', y='var_std', kind='scatter', figsize=(15,10), title='Standard Deviation of Std vs Mean')
Example data:
ID Variable_1
1234 32
1234 23
2345 54
2345 65
2345 76
3456 78
What I'd like:
ID Mean SD
1234 27.5 6.4
2345 65 11
...
...
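A likely cause of the error: the dict passed to agg repeats the key 'variable_1', and a Python dict literal keeps only the last value for a repeated key. agg therefore only ever sees 'std', the result has a single column, and assigning two column names fails. Below is a minimal sketch of one possible fix, which passes both aggregations in a single list. It assumes the columns are really named 'id' and 'variable_1' as in the code (the example data shows 'ID' and 'Variable_1', so adjust to whichever is correct), and it rebuilds the example data by hand.

```python
import pandas as pd

# Example data from the question. Lowercase column names are an assumption,
# chosen to match the names used in the groupby/agg call.
df1 = pd.DataFrame({
    "id": [1234, 1234, 2345, 2345, 2345, 3456],
    "variable_1": [32, 23, 54, 65, 76, 78],
})

# Select the column first, then request both statistics in one list,
# so neither aggregation overwrites the other.
df2 = df1.groupby("id")["variable_1"].agg(["mean", "std"])

# The result now has exactly two columns, so renaming works.
df2.columns = ["var_mean", "var_std"]

print(df2)
```

With two columns in place, the original df2.plot(x='var_mean', y='var_std', kind='scatter', ...) call should work unchanged. Note that std is the sample standard deviation by default, so an ID with a single row (3456 here) gets NaN. On pandas 0.25 or later, named aggregation such as .agg(var_mean='mean', var_std='std') should produce the renamed columns in one step.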