Kamikaze K

Reputation: 191

Smoothness of a dataset with pandas

I have a sample of two datasets, shown below:

[image: sample of Data A and Data B]

If I were to plot the two, Data A would give a much smoother line graph and Data B a spikier one. How can I use pandas to determine or differentiate the smoothness of a dataset, e.g. with a calculation on the data that gives an index I can equate to the smoothness of the data? I looked for a solution and found a suggestion to use the difference of standard deviations, but it was based on R. Any ideas on this? What sort of calculation would give me what I want? Can anyone point me in the right direction?

Upvotes: 1

Views: 968

Answers (1)

Esuom

Reputation: 58

Standard deviation doesn't necessarily measure smoothness in the sense you seem to mean. A straight-line graph (y=x) with A:1 B:2 C:3 D:4 would be smooth by your definition, right? Whereas A:4 B:1 C:3 D:2 would not (it goes up and down, changing direction), even though both contain the same values and so have the same standard deviation. I think what you are looking for is a change-of-slope calculation (the derivative or gradient of the function at different points).
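To make that concrete, here is a small sketch using the two four-point sequences above (the variable names are just for illustration). Standard deviation ignores the order of the values, so it cannot tell the two apart:

```python
import pandas as pd

# The two example sequences from above: same values, different order.
straight = pd.Series([1, 2, 3, 4], index=list("ABCD"))
zigzag = pd.Series([4, 1, 3, 2], index=list("ABCD"))

# Standard deviation depends only on the set of values, not their order,
# so both series get the same number.
std_straight = straight.std()
std_zigzag = zigzag.std()

print(std_straight, std_zigzag)
```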

In this case it's actually quite simple. Just calculate the sum of the absolute differences between consecutive points. The dataset with the greater total is more "spikey".

You can shift the data (Series.shift or DataFrame.shift in pandas), subtract the shifted values from the original, take the absolute value, and then sum.
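A minimal sketch of those steps, assuming each dataset is a column of a DataFrame. The helper name `spikiness` and the column values (reused from the four-point example earlier in this answer, not your real data) are only illustrative:

```python
import pandas as pd


def spikiness(series: pd.Series) -> float:
    """Sum of absolute differences between consecutive points."""
    shifted = series.shift(1)          # move every value down one row
    steps = (series - shifted).abs()   # size of each step; first row is NaN
    return float(steps.sum())          # sum() skips the NaN by default


# Illustrative data only: a straight line and a zigzag.
data = pd.DataFrame({"A": [1, 2, 3, 4], "B": [4, 1, 3, 2]})

# Apply the calculation to each column; a larger total means "spikier".
scores = data.apply(spikiness)
print(scores)
```

The shift-and-subtract step is the same as `series.diff()`, so `series.diff().abs().sum()` should give the same result in one line. The total grows with the number of points, so if the datasets differ in length, taking the mean of the absolute differences instead of the sum is probably a fairer index.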

Upvotes: 3
