Reputation: 1385
I have a fairly specific task and do not know how to accomplish it: I have two lists, x and y, of corresponding values (about 10k in each list).
First, I need to bin both lists according to their order in x, with N values in each bin. So I cannot pre-define fixed bin edges; rather, I need, e.g., 10 values in each bin.
Then I need to compute the median of the 10 y values corresponding to each x bin.
In the last step, I have a third list, z, with more values like x (about 100k values). For each value, I need to check which x bin it would fall into and add the mean value of the corresponding y bin to it (something like: z + mean[y_m:y_n][where x_m < z < x_n]). Any idea how to do that? Thanks!
Upvotes: 2
Views: 4414
Reputation: 226256
You can sort the data in place using list.sort() and then use slicing to create your bins (here s is the list of values to bin):
s.sort()
bins = []
for i in range(0, len(s), 10):
    bin = s[i: i+10]
    bins.append(bin)
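Because the question pairs each x with a y, the two lists would need to be sorted together rather than separately, otherwise the correspondence is lost. A minimal sketch of that idea, assuming x and y are equal-length lists; the function name bin_pairs, the parameter n, and the sample data are illustrative and not part of the original answer:

```python
def bin_pairs(x, y, n=10):
    """Sort (x, y) pairs by x and split them into bins of n pairs each."""
    pairs = sorted(zip(x, y))  # sorts by x; ties in x fall back to y
    # The last bin is shorter when len(x) is not a multiple of n.
    return [pairs[i:i + n] for i in range(0, len(pairs), n)]

# Small made-up example with n=3 instead of 10.
x = [5, 1, 4, 2, 3, 6]
y = [50, 10, 40, 20, 30, 60]
pair_bins = bin_pairs(x, y, n=3)
```

Each bin then holds (x, y) tuples, so the y values of a bin can be pulled out with a list comprehension such as [p[1] for p in pair_bins[0]].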
To get the median of each bin, average the middle two elements:
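medians = []
for bin in bins:
    n = len(bin)
    # Middle two elements for an even-sized bin (bin[4:6] when n == 10),
    # the single middle element for an odd-sized one, e.g. a short last bin.
    middle = bin[(n - 1) // 2: n // 2 + 1]
    median = sum(middle) / float(len(middle))
    medians.append(median)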
This should get you started. I don't want to deprive you of the joy of finishing the program yourself :-)
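For the remaining step from the question (finding the x bin for each z value), one possible approach is a binary search over the lower edge of each bin. The sketch below is only one way to do it and rests on several assumptions: the names build_bins and shift are hypothetical, a z value below the smallest x is clamped into the first bin, a z value above the largest x lands in the last bin, and the median is used as the per-bin statistic (the question mentions both median and mean; statistics.mean could be swapped in).

```python
from bisect import bisect_right
from statistics import median

def build_bins(x, y, n=10):
    """Return (lower x edge, median y) lists for bins of n pairs sorted by x."""
    pairs = sorted(zip(x, y))
    edges, medians = [], []
    for i in range(0, len(pairs), n):
        chunk = pairs[i:i + n]
        edges.append(chunk[0][0])  # smallest x in this bin
        medians.append(median([p[1] for p in chunk]))
    return edges, medians

def shift(z, edges, medians):
    """Add to each z value the median y of the x bin it falls in."""
    out = []
    for value in z:
        # Index of the last bin whose lower edge is <= value.
        # Assumption: values below the first edge use the first bin.
        k = max(bisect_right(edges, value) - 1, 0)
        out.append(value + medians[k])
    return out

# Small made-up example with n=3 instead of 10.
x = [5, 1, 4, 2, 3, 6]
y = [50, 10, 40, 20, 30, 60]
edges, medians = build_bins(x, y, n=3)
result = shift([0, 2.5, 4, 10], edges, medians)
```

Each lookup costs O(log B) for B bins, so about 100k z values against about 1k bins should be manageable in plain Python.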
Upvotes: 3