Reputation: 43
I am trying to replace the values in a data frame with their relative frequencies.
Here is my data:
blah <- list(c(1, 1, 2, 2, 3, 1, 3, 2, 2, 5, 5), c(7, 8, 7, 8, 9, 9, 7, 8, 9, 7, 7))
blah <- as.data.frame(blah)
colnames(blah) <- c("col1", "col2")
This creates a data frame with two columns.
Next, I use table() to compute the relative frequency of each value in both columns:
col1Freq <- table(blah[, 1]) / nrow(blah)
col2Freq <- table(blah[, 2]) / nrow(blah)
My goal is to replace every value in blah with its frequency. The final data frame should have the same dimensions as blah, but contain frequencies instead of the original integers.
Sorry, I don't have any pictures to show. Thanks for your help!
Upvotes: 4
Views: 1025
Reputation: 484
I faced the same problem. In my case, I needed this transformation so that I could later multiply the frequencies across columns for each row, which gives the frequency (probability) of a multivariate (multidimensional) observation if the columns are treated as independent.
My solution works for any number of columns:
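apply(blah, 2, function(x) {
  # Count how often each distinct value occurs in this column
  tab <- as.data.frame(table(x))
  # Look up each element's count and divide by the column length
  tab$Freq[match(x, tab[, 1])] / length(x)
})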
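The product of frequencies mentioned above could be sketched roughly as follows. This is only an illustration, assuming the columns are treated as independent; the names `freqs` and `rowProb` are made up here and are not part of the original answer, and the per-column lookup uses names on the table rather than match().

```r
# Data from the question
blah <- data.frame(col1 = c(1, 1, 2, 2, 3, 1, 3, 2, 2, 5, 5),
                   col2 = c(7, 8, 7, 8, 9, 9, 7, 8, 9, 7, 7))

# Per-column relative frequencies, returned as a numeric matrix
# with the same dimensions as blah
freqs <- apply(blah, 2, function(x) {
  tab <- table(x)
  # Index the table by value name to get each element's count
  as.vector(tab[as.character(x)]) / length(x)
})

# Row-wise product of the column frequencies (hypothetical name)
rowProb <- apply(freqs, 1, prod)
```

Here `rowProb` has one entry per row of blah. It only approximates a joint probability if the columns really are independent; otherwise the joint frequency would have to be counted directly.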
Upvotes: 1
Reputation: 162421
If I correctly understand your question, the base R function ave()
(pay no attention to its misleading name) will do what you're looking for.
blah2 <-
    transform(blah,
              col1Freq = ave(col1, col1, FUN = function(X) length(X) / nrow(blah)),
              col2Freq = ave(col2, col2, FUN = function(X) length(X) / nrow(blah)))
blah2[3:4]
#     col1Freq  col2Freq
# 1  0.2727273 0.4545455
# 2  0.2727273 0.2727273
# 3  0.3636364 0.4545455
# 4  0.3636364 0.2727273
# 5  0.1818182 0.2727273
# 6  0.2727273 0.2727273
# 7  0.1818182 0.4545455
# 8  0.3636364 0.2727273
# 9  0.3636364 0.2727273
# 10 0.1818182 0.4545455
# 11 0.1818182 0.4545455
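If there are many columns, the same ave() idea can probably be applied to all of them at once instead of naming each one. A minimal sketch, assuming every column should be replaced by its relative frequency; `blahFreq` is a name chosen here for illustration:

```r
# Data from the question
blah <- data.frame(col1 = c(1, 1, 2, 2, 3, 1, 3, 2, 2, 5, 5),
                   col2 = c(7, 8, 7, 8, 9, 9, 7, 8, 9, 7, 7))

# Copy blah, then overwrite every column in place;
# assigning to blahFreq[] keeps the data frame shape and column names
blahFreq <- blah
blahFreq[] <- lapply(blah, function(x) {
  # ave(x, x, FUN = length) gives, for each element, the size of its group
  ave(x, x, FUN = length) / nrow(blah)
})
```

The result has the same dimensions and column names as blah, with each value replaced by the share of rows in which it occurs.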
Upvotes: 4