Reputation: 619
I want to add a column that shows, for each value, the number of times it appears in the data set. For example, I would like to generate the Frequency column in the data frame below:
ID Frequency
111 4
205 2
603 6
111 4
In the original data, 111 appeared 4 times, 205 appeared 2 times, 603 appeared 6 times, and so on.
Upvotes: 0
Views: 566
Reputation: 39613
Based on @rogues77's comments, either of the following dplyr approaches works:
library(dplyr)
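# Code 1: keep every row and add the per-ID count N as a new column
newdf <- df %>% group_by(ID) %>% mutate(N = n())
# Code 2: collapse to one row per unique ID, with its count N
newdf <- df %>% group_by(ID) %>% summarise(N = n())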
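To make the difference between the two concrete, here is a minimal sketch on a small made-up data frame (the four-row `df` below is hypothetical, not the asker's data). `mutate()` keeps every row and attaches the count, while `summarise()` returns one row per `ID`. The `add_count()` and `count()` calls are dplyr shortcuts that should give the same two results.

```r
library(dplyr)

# Hypothetical sample data, for illustration only
df <- data.frame(ID = c(111L, 205L, 603L, 111L))

# Keeps all 4 rows; each row carries the count for its ID
per_row <- df %>% group_by(ID) %>% mutate(N = n()) %>% ungroup()

# Collapses to one row per unique ID (3 rows here)
per_id <- df %>% group_by(ID) %>% summarise(N = n())

# Shortcuts for the same two results
per_row2 <- df %>% add_count(ID, name = "N")
per_id2  <- df %>% count(ID, name = "N")
```

Since the question asks for a frequency column alongside the original rows, the `mutate()` / `add_count()` form is probably the one wanted.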
Upvotes: 2
Reputation: 887971
We can also use data.table methods:
library(data.table)
# Convert DF to a data.table in place and add the per-ID row count
setDT(DF)[, Freq := .N, by = ID]
# data
DF <- structure(list(ID = c(111L, 205L, 603L, 111L)), class = "data.frame",
                row.names = c(NA, -4L))
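As a rough illustration of what that call does, the sketch below uses a hypothetical four-row table named `DT` (the name and data are assumptions for the example). `:=` adds the column by reference, so no reassignment is needed, and dropping `:=` gives the one-row-per-ID summary instead.

```r
library(data.table)

# Hypothetical sample data, for illustration only
DT <- data.table(ID = c(111L, 205L, 603L, 111L))

# Adds Freq in place: .N is the number of rows in each ID group
DT[, Freq := .N, by = ID]

# Without :=, returns one row per ID with the count in column N
counts <- DT[, .N, by = ID]
```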
Upvotes: 1
Reputation: 270378
With the input data DF shown reproducibly in the Note at the end, use ave with length. No packages are used.
nr <- nrow(DF)
transform(DF, Freq = ave(1:nr, ID, FUN = length))
giving:
   ID Freq
1 111    2
2 205    1
3 603    1
4 111    2
or with dplyr:
library(dplyr)
DF %>%
  group_by(ID) %>%
  mutate(Freq = n()) %>%
  ungroup()
Note
Lines <- "ID
111
205
603
111"
DF <- read.table(text = Lines, header = TRUE)
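As a cross-check on the ave result, another base R route is to tabulate the IDs once and look each row up in that table. This is only a sketch on the same four-row input, and the intermediate name `counts` is introduced here for illustration.

```r
# Same sample input as in the Note
DF <- data.frame(ID = c(111L, 205L, 603L, 111L))

# Count each distinct ID once
counts <- table(DF$ID)

# Look up every row's ID by name; as.vector drops the table attributes
DF$Freq <- as.vector(counts[as.character(DF$ID)])
```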
Upvotes: 2