Mateusz Konopelski

Reputation: 1042

Z-score calculation from mean and st dev

I would like to ask whether any popular package, such as numpy or scipy, has a built-in function to calculate the z-score when I already know the critical value, mean, and standard deviation.

I usually do it like this:

def Zscore(xcritical, mean, stdev):
    return (xcritical - mean)/stdev

#example:
xcritical = 73.06
mean = 72
stdev = 0.5

zscore = Zscore(xcritical, mean, stdev)

Later I use scipy.stats.norm.cdf to calculate the probability of x being lower than xcritical.

import scipy.stats as st
print(st.norm.cdf(zscore))

I wonder if I can simplify this somehow. I know there is a scipy.stats.zscore function, but it takes a sample array rather than sample statistics.

Upvotes: 7

Views: 10605

Answers (4)

João Vitor Gomes

Reputation: 363

The life in hours of a battery is known to be approximately normally distributed with standard deviation σ = 1.25 hours. A random sample of 10 batteries has a mean life of x̄ = 40.5 hours. a. Is there evidence to support the claim that battery life exceeds 40 hours? Use α = 0.05.

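I use norm.sf (the survival function, 1 - cdf) because I want the area in the right tail of the bell curve: the probability of seeing a sample mean at least this large if the null hypothesis is true. That p-value is then compared with alpha.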

import numpy as np
from scipy.stats import norm

norm.sf(40.5, loc=40, scale=1.25/np.sqrt(10)) # scale is the standard error sigma/sqrt(n); recall the z-score formula

p-value ≈ 0.103

H0: μ = 40 ; Ha: μ > 40

The p-value (about 0.10) is greater than alpha (0.05), so we fail to reject the null hypothesis: the sample does not provide sufficient evidence to support the claim that the battery life exceeds 40 hours.
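The same one-sided z-test can also be sketched step by step with only the standard library. This is a minimal illustration, not the only way to do it: the variable names are my own, and statistics.NormalDist stands in for scipy's norm.sf.

```python
from math import sqrt
from statistics import NormalDist

sigma = 1.25   # known population standard deviation (hours)
n = 10         # sample size
xbar = 40.5    # sample mean (hours)
mu0 = 40.0     # mean under H0
alpha = 0.05   # significance level

se = sigma / sqrt(n)                 # standard error of the mean
z = (xbar - mu0) / se                # z-score of the sample mean
p_value = 1 - NormalDist().cdf(z)    # right-tail area, same role as norm.sf
reject_h0 = p_value < alpha          # reject only if the p-value is below alpha

print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

Writing out the standard error and the z-score separately makes it easier to see where the sqrt(10) in the scale argument comes from.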

Upvotes: 0

Xavier Guihot

Reputation: 61666

Starting with Python 3.9, the standard library provides a zscore method on the NormalDist object, as part of the statistics module:

from statistics import NormalDist

NormalDist(mu=72, sigma=.5).zscore(73.06)
# 2.1200000000000045
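If the end goal is the probability rather than the z-score itself, the same object can be asked for it directly. A small sketch, assuming the values from the question (the variable names are illustrative):

```python
from statistics import NormalDist

dist = NormalDist(mu=72, sigma=0.5)

z = dist.zscore(73.06)          # z-score; needs Python 3.9+
p = dist.cdf(73.06)             # P(X < 73.06) without computing z first
p_via_z = NormalDist().cdf(z)   # same probability via the standard normal

print(round(z, 2), round(p, 4), round(p_via_z, 4))
```

Both routes should agree up to floating-point rounding, so the intermediate z-score is only needed if you want to report it.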

Upvotes: 7

Anwarvic

Reputation: 12992

You can do it in just one line, like so:

>>> import scipy.stats as st
>>> st.norm(mean, stdev).cdf(xcritical)

Upvotes: 1

Harshit Lamba

Reputation: 481

In your question, I am not sure what you mean by calculating the probability of 'x' being lower than 'xcritical', because you have not defined 'x'. In any case, I will answer how to calculate the z-score for an 'x' value.

Going by the scipy.stats.norm documentation, there doesn't seem to be a built-in method to calculate the z-score for a value ('xcritical' in your case) given the mean and standard deviation. However, you can calculate it using the built-in methods cdf and ppf. Consider the following snippet (the values are the same as in your post, where 'xcritical' is the value for which you wish to calculate the z-score):

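from scipy.stats import norm

xcritical = 73.06
mean = 72
stdev = 0.5

p = norm.cdf(x=xcritical, loc=mean, scale=stdev)  # lower-tail probability of xcritical
z_score = norm.ppf(p)  # standard normal quantile for that probability
print('The z-score for {} corresponding to {} mean and {} std deviation is: {:.3f}'.format(xcritical, mean, stdev, z_score))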

Here, we first calculate the cumulative probability 'p' for the 'xcritical' value given 'mean' and 'stdev' using norm.cdf(). norm.cdf() returns the fraction of the area under the normal distribution curve from negative infinity up to an 'x' value ('xcritical' in this case). Then, we pass this probability to norm.ppf() to obtain the z-score corresponding to that 'x' value. norm.ppf() is the percent point function, which returns the (z) value corresponding to the given lower-tail probability in a standard normal distribution. The output of this code is 2.12, which is the same as what you will obtain from the function Zscore().
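As a quick cross-check of that round trip, here is a sketch using only the standard library. Note the substitution: statistics.NormalDist and its inv_cdf method stand in for scipy's norm.cdf and norm.ppf, and the variable names are illustrative.

```python
from statistics import NormalDist

xcritical, mean, stdev = 73.06, 72, 0.5

# Step 1: lower-tail probability of xcritical under N(mean, stdev)
p = NormalDist(mean, stdev).cdf(xcritical)

# Step 2: map that probability back to a standard normal quantile
z_roundtrip = NormalDist().inv_cdf(p)

# Direct formula from the question, for comparison
z_direct = (xcritical - mean) / stdev

print(f"{z_roundtrip:.3f} {z_direct:.3f}")
```

The two values should match to several decimal places here. The direct formula is cheaper, and it does not depend on how accurately the probability can be inverted.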

Hope that helps!

Upvotes: 1
