Reputation: 67
BLUF: (Using Python 3.0) I want to calculate and store the mean/std of my data in successive bands that are 0.25 wide, so I can plot them later or do further analysis.
Calculating the mean/std is easy, but I cannot quite get the algorithm right to iterate properly across the range of values.
Data: https://www.dropbox.com/s/y78pynq9onyw9iu/Data.csv?dl=0
What I have so far is normalized toy data that looks like a shotgun blast, with one target band (0.25 wide) isolated between the black lines:
import csv
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
Data=pd.read_csv("Data.csv")
g = sns.jointplot(x="x", y="y", data=Data)
bottom_lim = 0
top_lim = 0.25
temp = Data.loc[(Data.y>=bottom_lim)&(Data.y<top_lim)]
g.ax_joint.axhline(top_lim, c='k', lw=2)
g.ax_joint.axhline(bottom_lim, c='k', lw=2)
# we have to create a secondary y-axis on the joint plot, otherwise the kde
# might be very small compared to the scale of the original y-axis
ax_joint_2 = g.ax_joint.twinx()
sns.kdeplot(temp.x, shade=True, color='red', ax=ax_joint_2, legend=False)
ax_joint_2.spines['right'].set_visible(False)
ax_joint_2.spines['top'].set_visible(False)
ax_joint_2.yaxis.set_visible(False)
# calculating the StdDev of the y-axis band above
S = temp.std()
M = temp.mean()
print("StdDev", S)
print("Mean", M)
And now what I want to do is calculate the mean/std (below again):
S = temp.std()
M = temp.mean()
But I want to do this in a loop that covers the entire range of the "y" variable, from 0 to 8, and keep the results in a format that I can later plot or manipulate further (list, array, etc.).
Upvotes: 0
Views: 442
Reputation: 67
A simple while loop accomplishes what we want here:
bottom_lim, top_lim = 0, 0.25
g = sns.jointplot(x="x", y="y", data=Data)
# step through the bands [0, 0.25), [0.25, 0.5), ..., [7.75, 8)
while top_lim <= 8:
    temp = Data.loc[(Data.y>=bottom_lim)&(Data.y<top_lim)]
    g.ax_joint.axhline(top_lim, c='g', lw=2)
    g.ax_joint.axhline(bottom_lim, c='g', lw=2)
    ax_joint_2 = g.ax_joint.twinx()
    sns.kdeplot(temp.x, shade=True, color='green', ax=ax_joint_2, legend=False)
    ax_joint_2.spines['right'].set_visible(False)
    ax_joint_2.spines['top'].set_visible(False)
    ax_joint_2.yaxis.set_visible(False)
    # calculating the StdDev and mean of the current band
    S = temp.std()
    M = temp.mean()
    print("StdDev", S)
    print("Mean", M)
    # move on to the next band
    bottom_lim+=0.25
    top_lim+=0.25
A band that contains no data throws an error, so the top/bottom limits will have to be adjusted to account for missing data. When I ran this code with the top and bottom limits under 2, it worked beautifully.
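One possible way around the empty-band problem, and a way to keep the results rather than just print them, is to skip any band with no rows and collect the statistics in a list that becomes a DataFrame. This is only a sketch under some assumptions: the helper name band_stats and the output column names are made up for illustration, the data is assumed to have "x" and "y" columns as in the question, and the plotting is left out to keep it short. A synthetic frame stands in for Data.csv.

```python
import numpy as np
import pandas as pd


def band_stats(df, step=0.25, y_min=0.0, y_max=8.0):
    """Mean/std of x for each [low, high) band of y; empty bands are skipped.

    Hypothetical helper, not part of pandas or seaborn.
    """
    rows = []
    n_bands = int(round((y_max - y_min) / step))
    for i in range(n_bands):
        low = y_min + i * step
        high = low + step
        band = df.loc[(df.y >= low) & (df.y < high)]
        if band.empty:
            continue  # nothing to summarise in this band
        rows.append({
            "bottom_lim": low,
            "top_lim": high,
            "n": len(band),
            "mean_x": band.x.mean(),
            "std_x": band.x.std(),  # NaN when the band holds a single row
        })
    return pd.DataFrame(rows)


# Synthetic stand-in for Data.csv (assumption: columns "x" and "y")
rng = np.random.default_rng(0)
demo = pd.DataFrame({"x": rng.normal(size=500),
                     "y": rng.uniform(0, 2, size=500)})
stats = band_stats(demo)
print(stats)
```

The resulting frame has one row per non-empty band, so something like stats.plot(x="bottom_lim", y="mean_x") should work for a quick look afterwards.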
But if there is a more elegant way, I'm always looking to reduce/reuse/recycle.
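One candidate for a more compact version, offered as a sketch rather than a tested drop-in: let pandas do the banding with pd.cut and compute the statistics with groupby. It assumes the same "x" and "y" columns as the question, uses synthetic data in place of Data.csv, and does not reproduce the jointplot/kde overlays. The names edges, bands and summary are just illustrative.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for Data.csv (assumption: columns "x" and "y")
rng = np.random.default_rng(1)
demo = pd.DataFrame({"x": rng.normal(size=1000),
                     "y": rng.uniform(0, 8, size=1000)})

# Band edges 0, 0.25, ..., 8 and half-open [low, high) intervals
edges = np.arange(0, 8.25, 0.25)
bands = pd.cut(demo["y"], bins=edges, right=False)

# One row per band that actually contains data (observed=True drops empty bands)
summary = (
    demo.groupby(bands, observed=True)["x"]
        .agg(["count", "mean", "std"])
)
print(summary.head())
```

Here right=False matches the y>=bottom_lim and y<top_lim test used in the loop, and empty bands simply do not appear in the result, so there is nothing to special-case.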
Upvotes: 1