Reputation: 186
I am trying to build a transitions matrix from Panel data observations in order to obtain the ML estimators of a weighted transitions matrix. A key step is obtaining the individual likelihood function for individuals. Say you have the following data frame:
ID Feature1 Feature2 Transition
120421006 10000 1 ab
120421006 12000 0 ba
120421006 10000 1 ab
123884392 3000 1 ab
123884392 2000 0 ba
908747738 1000 1 ab
The idea is to return, for each agent, the log-likelihood of his path. For agent 120421006, for instance, this boils down to (ignoring the initial term)
LL = log(exp(Yab)/1 + exp(Yab)) + log(exp(Yba) /(1 + exp(Yba))) + log(exp(Yab)/1 + exp(Yab))
i.e,
log(exp(Y_transition)/(1 + exp(Y_transition)))
where Y_transition = xFeature1 + yFeature2 for that transition, and x and y are unknowns.
For example, for individual 120421006, this would boil down to an expression with three elements, since he transitions thrice, and the function would return
LL = log(exp(10000x + 1y)/ 1 + exp(10000x + 1y)) +
log(exp(12000x + 0y)/ 1 + exp(12000x + 0y)) +
log(exp(10000x + 1y)/ 1 + exp(10000x + 1y))
And here's the catch: I need x and y to return as unknowns, since the objective is to obtain a sum over the likelihoods of all individuals in order to pass it to an ML estimator. How would you automate a function that returns this output for all IDs?
Many thanks in advance
Upvotes: 0
Views: 58
Reputation: 79188
Create the function:
fun=function(x){
a=paste0("exp(",x[1],"*x","+",x[2],"*y)")
parse(text=paste("sum(",paste0("log(",a,"/(1+",a,"))"),")"))
}
by(test[2:3],test[,1],fun)
sum(log(exp(c(10000, 12000, 10000) * x + c(1, 0, 1) * y)/(1 +
exp(c(10000, 12000, 10000) * x + c(1, 0, 1) * y))))
--------------------------------------------------------------------
sum(log(exp(c(3000, 2000) * x + c(1, 0) * y)/(1 + exp(c(3000,
2000) * x + c(1, 0) * y))))
--------------------------------------------------------------------
sum(log(exp(1000 * x + 1 * y)/(1 + exp(1000 * x + 1 * y))))
taking an example of x=0
and y=3
we can solve this:
x=0
y=3
sapply(by(test[2:3],test[,1],fun),eval)
[1] -0.79032188 -0.74173453 -0.04858735
in your example above:
x=0
y=3
log(exp(10000*x + 1*y)/ (1 + exp(10000*x + 1*y))) +#There should be paranthesis
log(exp(12000*x + 0*y)/ (1 + exp(12000*x + 0*y))) +
log(exp(10000*x + 1*y)/( 1 + exp(10000*x + 1*y)))
[1] -0.7903219
To get what you need within the comments:
fun1=function(x){
a=paste0("exp(",x[1],"*x","+",x[2],"*y)")
paste("sum(",paste0("log(",a,"/(1+",a,"))"),")")
}
paste(by(test[2:3],test[,1],fun1),collapse = "+")
1] "sum( log(exp(c(10000, 12000, 10000)*x+c(1, 0, 1)*y)/(1+exp(c(10000, 12000, 10000)*x+c(1, 0, 1)*y))) )+sum( log(exp(c(3000, 2000)*x+c(1, 0)*y)/(1+exp(c(3000, 2000)*x+c(1, 0)*y))) )+sum( log(exp(1000*x+1*y)/(1+exp(1000*x+1*y))) )"
But this doesnt make sense why you would group them and then sum all of them. That is same as just summing them without grouping them using the ID which would be simpler and faster
Upvotes: 1
Reputation: 2652
First you have to decide how flexible your function has to be. I am leaving it fairly rigid, but you can alter it at your flavor.
First, you have to input the initial guess parameters, which you will supply in the optimizer. Then, declare your data and variables to be used in your estimation.
Assuming you will always have only 2 variables (you can change it later)
y <- function(initial_param, data, features){
x = initial_param[1]
y = initial_param[2]
F1 = data[, features[1]]
F2 = data[, features[2]]
LL = log(exp(F1 * x + F2 * y) / (1 + exp(F1 * x + F2 * y)))
return(-sum(LL))
}
This function returns the sum of minus the log likelihood, given that most optimizers try to find the parameters at which your function reaches a minimum, by default.
To find your parameters just supply the below function with your likelihood function y
, the initial parameters, data set and a vector with the names of your variables:
nlm(f = y, initial_param = your_starting_guess, data = your_data,
features = c("name_of_first_feature", "name_of_second_feature"), iterlim=1000, hessian=F)
Upvotes: 1