Reputation: 25
I want to find the average of all scores above the median and the average of all scores below it (not including the median itself), but I have no idea how to go about doing this.
import collections

def main():
    names = ["gymnastics_school", "participant_name", "all_around_points_earned"]
    Data = collections.namedtuple("Data", names)
    data = []
    values = []
    with open('state_meet.txt', 'r') as f:
        for line in f:
            line = line.strip()
            items = line.split(',')
            items[2] = float(items[2])
            data.append(Data(*items))
            values.append(items[2])
    print("summary of data:")
    sorted_data = sorted(values)
    if len(values) % 2 == 0:
        a = sorted_data[len(values) // 2]
        b = sorted_data[len(values) // 2 - 1]
        median_val = (a + b) / 2  # true division; // would floor the result
    else:
        median_val = sorted_data[(len(values) - 1) // 2]
    print(" median score", median_val)  # median

main()
Upvotes: 0
Views: 89
Reputation: 1091
Here is an example:
import numpy as np
data_array = np.array(data)
med = np.median(data)
ave_above_med = data_array[data_array > med].mean()
So the function would be:
import numpy as np

def average_above_med(data):
    data_array = np.array(data)
    med = np.median(data_array)
    response = data_array[data_array > med].mean()
    return response
This is a test of that:
test_data = [1, 5, 66, 7, 5]
print(average_above_med(test_data))
which displays:
36.5
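Since the question also asks for the average below the median, here is a minimal sketch that applies the same boolean-mask idea to both sides. The function name `averages_around_median` and the choice to return `None` for an empty side are my own assumptions, not part of NumPy.

```python
import numpy as np

def averages_around_median(data):
    """Return (mean below median, mean above median), excluding the median itself."""
    arr = np.asarray(data, dtype=float)
    med = np.median(arr)
    below = arr[arr < med]  # strictly below, so values equal to the median are dropped
    above = arr[arr > med]  # strictly above
    # Assumption: report None for an empty side instead of letting mean() return nan
    lo = float(below.mean()) if below.size else None
    hi = float(above.mean()) if above.size else None
    return lo, hi

print(averages_around_median([1, 5, 66, 7, 5]))  # (1.0, 36.5)
```

With the test data above the median is 5, so only 1 falls below it and 66 and 7 fall above it.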
Hope this helps.
Upvotes: 0
Reputation: 6789
You can use the built-in functions filter and sum. For example:
above_med = list(filter(lambda x: x > median_val, values))  # list() is needed in Python 3, where filter returns an iterator
print(" average of scores above median ", sum(above_med) / len(above_med))
As suggested by @ChrisP, you can also use the statistics module from the standard library, available since Python 3.4.
Upvotes: 0
Reputation: 5942
We now have statistics as part of the standard library:
import statistics
nums = list(range(10))
med = statistics.median(nums)
hi_avg = statistics.mean(i for i in nums if i > med)
lo_avg = statistics.mean(i for i in nums if i < med)
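For the `range(10)` example the median is 4.5, so `hi_avg` is 7 and `lo_avg` is 2. Note that `statistics.mean` raises `StatisticsError` on empty input, which happens when no score lies strictly above or below the median. Below is a rough sketch of how this might be applied to lines in the question's `school,name,score` format. The helper name `split_averages` and the sample lines are made up for illustration; the real data would come from `state_meet.txt`.

```python
import statistics

def split_averages(scores):
    """Return (median, mean below median, mean above median)."""
    med = statistics.median(scores)
    lower = [s for s in scores if s < med]  # strictly below the median
    upper = [s for s in scores if s > med]  # strictly above the median
    # Assumption: return None for an empty side, since statistics.mean
    # raises StatisticsError when given no data
    lo_avg = statistics.mean(lower) if lower else None
    hi_avg = statistics.mean(upper) if upper else None
    return med, lo_avg, hi_avg

# Hypothetical sample lines in the same comma-separated layout as the question
sample_lines = [
    "North Gym,Ann,9.0",
    "North Gym,Bea,7.5",
    "South Gym,Cy,8.0",
    "South Gym,Di,6.5",
]
scores = [float(line.strip().split(",")[2]) for line in sample_lines]
print(split_averages(scores))  # (7.75, 7.0, 8.5)
```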
Upvotes: 1