Reputation: 25
My professor assigned us some homework questions regarding normal distributions. We are using RStudio to calculate our values instead of the z-tables.
One question, about meteors, gives a mean (μ) = 4.35 and a standard deviation (σ) = 0.59, and we are looking for the probability that x > 5. I already figured out the answer with 1-pnorm((5-4.35)/0.59) ≈ 0.135.
However, I am currently having some difficulty trying to understand what pnorm calculates.
Originally, I just assumed that z-scores were the only arguments needed, so I proceeded to use pnorm(z-score) for most of the normal curve problems. The help page for pnorm, accessed through ?pnorm, indicates that the usage is: pnorm(q, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE).
My professor also says that I am ignoring the mean and sd by just using pnorm(z-score). I feel like it is just easier to type in one value instead of the whole set of arguments. So I experimented and found that 1-pnorm((5-4.35)/0.59) = 1-pnorm(5,4.35,0.59)
So it looks like pnorm(z-score) = pnorm(x,μ,σ).
Is there a reason that using the z-score allows me to skip the mean and standard deviation in the pnorm function?
I have also noticed that adding the μ,σ arguments along with the z-score gives the wrong answer (ex: pnorm(z-score,μ,σ)).
> 1-pnorm((5-4.35)/0.59)
[1] 0.1352972
> pnorm(5,4.35,0.59)
[1] 0.8647028
> 1-pnorm(5,4.35,0.59)
[1] 0.1352972
> 1-pnorm((5-4.35)/0.59,4.35,0.59)
[1] 1
Upvotes: 0
Views: 4148
Reputation: 6441
That is because a z-score follows the standard normal distribution, meaning it has μ = 0 and σ = 1, which, as you found out, are the default parameters for pnorm().
The z-score is just the transformation of any normally distributed value to a standard normally distributed one.
So when you compute the upper-tail probability for the z-score of x = 5, you get the same value as when you ask for the probability of x > 5 in a normal distribution with μ = 4.35 and σ = 0.59.
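To make that equivalence concrete, here is a small sketch using the numbers from the question. The variable names (mu, sigma, x, z, p_from_z, p_from_x) are just illustrative choices, not anything pnorm() requires.

```r
# Values taken from the question
mu    <- 4.35
sigma <- 0.59
x     <- 5

# Standardize x by hand, then use the default mean = 0, sd = 1
z        <- (x - mu) / sigma
p_from_z <- 1 - pnorm(z)

# Or let pnorm() do the standardizing by passing mean and sd
p_from_x <- 1 - pnorm(x, mean = mu, sd = sigma)

# Both routes give the same upper-tail probability (about 0.135)
print(all.equal(p_from_z, p_from_x))
```

Either form is fine; passing mean and sd simply saves you from computing the z-score yourself.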
But when you pass μ = 4.35 and σ = 0.59 to pnorm() together with your z-score, you get the wrong answer, because you are looking up a standard normally distributed value in a different distribution: pnorm() standardizes its first argument again with the mean and sd you give it.
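As a rough illustration of why that call goes wrong, the sketch below standardizes the z-score a second time by hand and compares the result with pnorm(z, 4.35, 0.59). The names z, z_twice and wrong are only for this example.

```r
mu    <- 4.35
sigma <- 0.59

# The z-score of x = 5 (roughly 1.1)
z <- (5 - mu) / sigma

# Passing z together with mu and sigma treats z as if it were a raw x,
# so it effectively gets standardized a second time (roughly -5.5)
z_twice <- (z - mu) / sigma
wrong   <- pnorm(z, mean = mu, sd = sigma)

# The mistaken call matches the standard normal CDF at the doubly
# standardized value, which is almost zero
print(all.equal(wrong, pnorm(z_twice)))

# Hence 1 - wrong is so close to 1 that R prints it as 1
print(1 - wrong)
```

A value of about 1.1 sits far in the left tail of a normal distribution centered at 4.35, which is why almost all of the probability lies above it.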
pnorm() (to answer your first question) calculates the cumulative distribution function, which gives you P(X ≤ x) (the probability that a random variable takes a value less than or equal to x). That's why you do 1 - pnorm(..) to find P(X > x).
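As a side note, the lower.tail argument shown in the usage line from the help page lets pnorm() return the upper tail directly, so the subtraction from 1 is optional. A minimal sketch with the numbers from the question (the names lower, upper_by_subtraction and upper_direct are just for illustration):

```r
mu    <- 4.35
sigma <- 0.59

# Lower tail: P(X <= 5), the default behaviour
lower <- pnorm(5, mean = mu, sd = sigma)

# Upper tail by subtraction: P(X > 5) = 1 - P(X <= 5)
upper_by_subtraction <- 1 - lower

# Upper tail requested directly with lower.tail = FALSE
upper_direct <- pnorm(5, mean = mu, sd = sigma, lower.tail = FALSE)

# The two upper-tail values agree (about 0.135)
print(all.equal(upper_by_subtraction, upper_direct))
```

Because the normal distribution is continuous, P(X < x) and P(X ≤ x) are the same number, so it does not matter which inequality the homework uses.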
Upvotes: 1