Wendy

Reputation: 43

R replace value with frequency

I am trying to replace the values in a data frame with their frequencies.

Here is my data:

blah<-list(c(1,1,2,2,3,1,3,2,2,5,5), c(7,8,7,8,9,9,7,8,9,7,7))
blah<-as.data.frame(blah)
colnames(blah)<-c("col1","col2")

I have created a table with two columns.

Next, I use "table" to generate the frequency for both columns:

col1Freq<-table(blah[,1])/dim(blah)[1]
col2Freq<-table(blah[,2])/dim(blah)[1]
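As a rough sketch of what these two objects hold (the data is rebuilt here so the snippet runs on its own): each is a named table of relative frequencies, with the distinct column values as names. `prop.table()` is a base R shortcut for the same division.

```r
# Rebuild the example data so this snippet is self-contained
blah <- data.frame(col1 = c(1, 1, 2, 2, 3, 1, 3, 2, 2, 5, 5),
                   col2 = c(7, 8, 7, 8, 9, 9, 7, 8, 9, 7, 7))

# Relative frequency of each distinct value, as in the lines above
col1Freq <- table(blah[, 1]) / dim(blah)[1]
col2Freq <- table(blah[, 2]) / dim(blah)[1]

# prop.table() divides a table by its total, giving the same numbers
col1Alt <- prop.table(table(blah[, 1]))

# The names are the distinct values, stored as character strings
names(col1Freq)
```

Note that each table has one entry per distinct value (4 for col1, 3 for col2), not one entry per row, which is why a lookup step is still needed.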

My goal is to replace every value in blah with its frequency. The final table should be the same size as blah, but contain frequencies instead of integers.

Sorry I don't have any pics to show.... Thanks for your help!!!!

Upvotes: 4

Views: 1025

Answers (2)

Andrey Sapegin

Reputation: 484

I faced the same problem. In my case, I need this transformation so that I can later calculate the product of the frequencies across columns, which should give the frequency (probability) of the multivariate (multidimensional) data.

My solution works for any number of columns:

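apply(blah, 2, function(x) {
  # Count how often each distinct value occurs in this column
  counts <- as.data.frame(table(x))
  # Look up each element's count, then divide by the column length
  counts$Freq[match(x, counts[, 1])] / length(x)
})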
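One thing to keep in mind is that apply() returns a matrix rather than a data frame. A minimal sketch of a variant that keeps the data frame, and then forms the row-wise product mentioned above, might look like this. The helper name `rel_freq` and the variables `freqs` and `joint` are made up for illustration.

```r
# Rebuild the example data so this snippet is self-contained
blah <- data.frame(col1 = c(1, 1, 2, 2, 3, 1, 3, 2, 2, 5, 5),
                   col2 = c(7, 8, 7, 8, 9, 9, 7, 8, 9, 7, 7))

# Hypothetical helper: relative frequency of each element of x
rel_freq <- function(x) {
  counts <- table(x)
  # Index the table by name to get one count per element of x
  as.numeric(counts[as.character(x)]) / length(x)
}

# lapply() works column by column; as.data.frame() keeps the column names
freqs <- as.data.frame(lapply(blah, rel_freq))

# Row-wise product of the per-column frequencies
joint <- Reduce(`*`, freqs)
```

The product in `joint` matches the true joint frequency of a row only if the columns are independent, so it is best read as an approximation under that assumption.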

Upvotes: 1

Josh O'Brien

Reputation: 162421

If I correctly understand your question, the base R function ave() (pay no attention to its misleading name) will do what you're looking for.

blah2 <- 
transform(blah,
          col1Freq = ave(col1, col1, FUN=function(X) length(X)/nrow(blah)),
          col2Freq = ave(col2, col2, FUN=function(X) length(X)/nrow(blah)))

blah2[3:4]
#     col1Freq  col2Freq
# 1  0.2727273 0.4545455
# 2  0.2727273 0.2727273
# 3  0.3636364 0.4545455
# 4  0.3636364 0.2727273
# 5  0.1818182 0.2727273
# 6  0.2727273 0.2727273
# 7  0.1818182 0.4545455
# 8  0.3636364 0.2727273
# 9  0.3636364 0.2727273
# 10 0.1818182 0.4545455
# 11 0.1818182 0.4545455
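If there are many columns, writing one ave() call per column gets tedious. A possible sketch of the same idea applied to every column at once is below. The name `blah_freq` is made up here, and grouping `seq_along(col)` rather than the column itself is a choice made for this sketch so that it does not depend on the column being numeric.

```r
# Rebuild the example data so this snippet is self-contained
blah <- data.frame(col1 = c(1, 1, 2, 2, 3, 1, 3, 2, 2, 5, 5),
                   col2 = c(7, 8, 7, 8, 9, 9, 7, 8, 9, 7, 7))

# Start from a copy so the result has the same shape and column names
blah_freq <- blah

# For each column, ave() fills every position with the size of its group;
# dividing by the number of rows turns counts into relative frequencies
blah_freq[] <- lapply(blah, function(col) {
  ave(seq_along(col), col, FUN = length) / nrow(blah)
})
```

Assigning into `blah_freq[]` replaces the contents while leaving the data frame structure in place, so the result is the same size as blah, as the question asks.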

Upvotes: 4
