Reputation: 35792
So I have a bunch of hyperlinks on a web page. From past observation I know the probabilities that a user will click on each of these hyperlinks. I can therefore calculate the mean and standard deviation of these probabilities.
I now add a new hyperlink to this page. After a short amount of testing I find that of the 20 users that see this hyperlink, 5 click on it.
Taking into account the known mean and standard deviation of the click-through probabilities on other hyperlinks (this forms a "prior expectation"), how can I efficiently estimate the probability of a user clicking on the new hyperlink?
A naive solution would be to ignore the other probabilities, in which case my estimate is just 5/20 = 0.25; however, this means we are throwing away relevant information, namely our prior expectation of what the click-through probability is.
So I'm looking for a function that looks something like this:
double estimate(double priorMean,
                double priorStandardDeviation,
                int clicks, int views);
Since I'm more familiar with code than mathematical notation, I'd ask that any answers use code or pseudocode in preference to math.
Upvotes: 0
Views: 1909
Reputation: 6904
I made this a new answer since it's fundamentally different.
This is based on Chris Bishop, Pattern Recognition and Machine Learning, Chapter 2 "Probability Distributions", p71++, and http://en.wikipedia.org/wiki/Beta_distribution.
First we fit a beta distribution to the given mean and variance in order to build a prior distribution over the parameter. Then we return the mode of the posterior distribution, which is the most probable parameter of a Bernoulli variable.
def estimate(prior_mean, prior_variance, clicks, views):
    # Fit a Beta(a, b) prior by the method of moments:
    # a + b = mean * (1 - mean) / variance - 1, a = mean * (a + b).
    c = prior_mean * (1 - prior_mean) / prior_variance - 1
    a = prior_mean * c
    b = (1 - prior_mean) * c
    # The posterior is Beta(a + clicks, b + views - clicks); return its
    # mode, (alpha - 1) / (alpha + beta - 2), i.e. the MAP estimate.
    return (a + clicks - 1) / (a + b + views - 2)
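For example, with a made-up prior mean of 0.3 and prior variance 0.01, the question's 5 clicks out of 20 views give:

estimate(0.3, 0.01, 5, 20)   # c = 20, a = 6, b = 14, so (6 + 5 - 1) / (6 + 14 + 20 - 2) ≈ 0.263

The raw 5/20 = 0.25 gets pulled toward the prior mean of 0.3, as you would expect.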
However, I am quite positive that the prior mean/variance will not work for you, since they throw away information about how many samples your prior is based on, and thus how strong that prior is.
Instead: given a set of (webpage, link_clicked) pairs, you can count the number of times a specific link was clicked. Let that be m. Let the number of times that link was shown but not clicked be l.
Now let a be the number of clicks on your new link and b the number of times it was shown but not clicked. Then the probability of a click on your new link is
def estimate(m, l, a, b):
    # Posterior mean of a Beta(m, l) prior updated with a clicks and b non-clicks.
    return (m + a) / (m + l + a + b)
This looks pretty trivial, but it actually has a valid probabilistic foundation. From the implementation perspective, you can keep m and l as global counts.
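As a made-up example: say the existing links were clicked m = 30 times and not clicked l = 70 times in total, and the new link has a = 5 clicks and b = 15 non-clicks:

estimate(30, 70, 5, 15)   # (30 + 5) / (30 + 70 + 5 + 15) = 35 / 120 ≈ 0.292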
Upvotes: 3
Reputation: 6904
P/N is actually correct from a frequentist perspective.
You could also use a Bayesian approach to incorporate prior knowledge, but since you don't seem to have that knowledge, I guess P/N is the way to go.
If you want, you can also use Laplace's rule, which (iirc) comes down to a uniform prior: just give each link on the page a starting count of 1 instead of 0. (So if you count the number of times a link was clicked, give each a +1 bonus and reflect that in your N.) A minimal sketch of this follows below.
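Here is that sketch, written as the classic rule of succession, which amounts to a uniform Beta(1, 1) prior; the function name is just for illustration:

def laplace_estimate(clicks, views):
    # One pseudo-click and one pseudo-non-click from the uniform prior.
    return (clicks + 1) / (views + 2)

laplace_estimate(5, 20)   # the question's data: 6 / 22 ≈ 0.273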
[UPDATE] Here is a Bayesian approach:
Let p(W) be the probability that a person is in a specific group W. Let p(L) be the probability that a specific link is clicked. Then the probability you are looking for is p(L|W). By Bayes' theorem, you can calculate this as
p(L|W) = p(W|L) * p(L) / p(W)
You can estimate p(L) by the fraction of views in which L was clicked, p(W) by the size of that group relative to all users, and p(W|L) = p(W and L) / p(L) by the number of persons in group W that clicked L divided by the total number of persons that clicked L.
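A minimal sketch of that estimate from raw counts (all parameter names here are made up for illustration):

def estimate_p_link_given_group(clicks_from_group, total_clicks,
                                group_size, total_users, total_views):
    p_l = total_clicks / total_views                 # p(L)
    p_w = group_size / total_users                   # p(W)
    p_w_given_l = clicks_from_group / total_clicks   # p(W|L)
    # Bayes' theorem: p(L|W) = p(W|L) * p(L) / p(W)
    return p_w_given_l * p_l / p_w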
Upvotes: 2
Reputation: 26427
You need to know how strongly X is correlated with W.
Most likely you also want a more complex mathematical model if you want to develop a big website. If you run a website like Digg, you have a lot of prior knowledge that you have to factor into your calculation. That leads to multivariate statistics.
Upvotes: 0
Reputation: 47944
Bayes' Theorem Proof:
P(A,B) = P( A | B ) * P( B ) (1)
since,
P(A,B) = P(B,A) (2)
and since, by the same product rule, P(B,A) = P(B | A) * P(A), combining (1) and (2) gives
P(A | B) * P(B) = P(B | A) * P(A)
thus (Bayes' Theorem),
P(A | B) = P(B | A) * P(A) / P(B)
P(A) -- prior/marginal probability of A, may or may not take into account B
P(A|B) -- conditional/posterior probability of A, given B.
P(B|A) -- conditional probability of B given A.
P(B) -- prior/marginal probability of B
Consequences: if P(A | B) = P(A), then A and B are independent, in which case P(B | A) = P(B) as well, and the definition of independence is
P(A,B) = P(A | B) * P(B) = P(A) * P(B)
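As a quick numeric sanity check of the theorem with made-up counts (out of 100 page views, 30 involve event A, 20 involve event B, and 10 involve both):

p_a, p_b, p_a_and_b = 30 / 100, 20 / 100, 10 / 100
p_b_given_a = p_a_and_b / p_a                       # 1/3
p_a_given_b = p_b_given_a * p_a / p_b               # Bayes: (1/3) * 0.3 / 0.2 = 0.5
assert abs(p_a_given_b - p_a_and_b / p_b) < 1e-9    # matches P(A,B) / P(B) directly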
It should be noted that it is easy to manipulate the probability to your liking by changing the priors and the way the problem is framed; take a look at this discussion of the Anthropic Principle and Bayes' Theorem.
Upvotes: 0