Reputation: 35
Following a previous question I asked, I got an awesome answer.
Here is a quick summary: I want to compute a multidimensional development index based on South Africa Data for several years. My list is composed of individual information for each year, so basically df1 is about year 1 and df2 about year2.
df1<-data.frame(var1=c(1, 1,1), var2=c(0,0,1), var3=c(1,1,0))
df2<-data.frame(var1=c(1, 0,1), var2=c(1,0,1), var3=c(0,1,0))
mylist <-list (df1,df2)
var1 could be the stance on religion of each person, var2 how she voted in last national election, etc. In my very simple case, I have the data for 3 different persons each year. From there, I compute an index based on a number of variables (not all of them) You can find here a very simplified working index function, with only 2 of 3 variables, named dimX and dimY:
myindex <- function(x, dimX, dimY){
econ_i<- ( x[dimX]+ x[dimY] )
return ( (1/length(econ_i))*sum(econ_i) )
}
myindex(df1, "var2", "var3")
and
myindex2 = function(x, d) {
myindex(x, d[1], d[2])
}
Then I have my dataframe of variables I want to use for my index. I am trying to compute the index for several sets of variables.
args <- data.frame(set1=c("var1", "var2"), set2=c("var2", "var3"), stringsAsFactors = F)
I'd like to have the result as follows : (a)list(set1 = list(df1, df2), set2 = (df1, df2))
instead of (b) list(df1 = list(set1, set2), df2 = list(set1, set2))
.
Case (a) represents a time series, meaning I have a list of results of my indexes each year for only one set of variables. Case (b) is the opposite where I have the index results of one year for every set of variables. Each individual result should be a unique numeric value. Hence, I am expecting to get a list of 2 sublists df1 and df2, each sublist containing 3 numeric values.
I've been adviced to do use that great command:
lapply(mylist, function(m) lapply(args, myindex2, x = m))
It's working great, but I get the result in the "wrong" format, namely the second one (b) I showed. How could I get the results ordered per set (i.e. case (a) as time series) instead of per year?
Thanks a lot for your help!
PJ
EDIT: I've managed to find a solution that doesn't answer the question, but still allows me to get my data in desired order. Namely, I'm transforming my list of lists to a matrix that I simply transpose.
Upvotes: 0
Views: 685
Reputation: 35
If that may provide any help, from this article, here my actual index function:
RCI_a_3det <-function(x, econ1, econ2, econ3, perso1, perso2, perso3, civic1, civic2, civic3){
econ_i<- (1/3) *( x[econ1]+ x[econ2] + x[econ3])
perso_i<- (1/3)*( x[perso1] + x[perso2] + x[perso3])
civic_i<- (1/3)*(x[civic1] + x[civic2] + x[civic3])
daf <- data.frame(econ_i, perso_i, civic_i)
colnames(daf)<- c("econ_i", "perso_i", "civic_i")
df1 <- subset(daf, daf$econ_i !=1 & daf$perso_i !=1 & daf$civic_i!=1 )
sum_xik <- (df1$econ_i + df1$perso_i + df1$civic_i)
return ( 1/(3*nrow(df1)) * sum(sum_xik, na.rm=T))
}
Edit: x is a list of all personal information, for every variable and for every year. It is pretty large. I am using 9 variables to compute this index, but I actually have 30 such variables in my data, so I have set up a dataframe of sets of variables I could use to compute this index. This is the equivalent of my args df in the simple example. I am actually using 200 such combinations.
Upvotes: 1
Reputation: 1058
This answer will be edited!
Currently, your function index()
does this
myindex <- function(x, dimX, dimY){
econ_i<- ( x[dimX]+ x[dimY] )
return ( (1/length(econ_i))*sum(econ_i) )
}
Aren't you after this, however?
myindex <- function(x, dimX, dimY){
econ_i<- ( x[,dimX]+ x[,dimY] )
return ( (1/length(econ_i))*sum(econ_i) )
}
The way you have it right now, length(econ_i)
always returns 1 because econ_i
is a data.frame()
and not a vector. The length of a data.frame()
is always 1, while the length of a vector is the number of elements within it.
Kindly note that here is what the output looks like in R.
df1["var1"]
var1
1 1
2 1
3 1
returns a data.frame()
df1[,"var1"]
[1] 1 1 1
returns a vector.
I will adjust this post to answer your question when you respond. I think it's important to solve this part first.
Upvotes: 1