Reputation: 596
I have a dataframe that looks like this
Sample_No Lab_ID
1234 2
1235 2
1236 2
2344 3
3425 4
2341 5
6756 5
...
I want to count how many times each Lab_ID occurs, but with the running occurrence number next to it in a new column of the dataframe, so that it looks something like the following
Sample_No Lab_ID Occurrence
1234 2 1
1235 2 2
1236 2 3
2344 3 1
3425 4 1
2341 5 1
6756 5 2
...
I can get a list of the unique values by using
I could do something like
table(df$Lab_ID)
but that produces a table summarizing the total count for each Lab_ID, not a running number per row.
Any help appreciated.
Upvotes: 2
Views: 65
Reputation: 101034
A base R option using sequence + rle
transform(
df,
Occurrence = sequence(rle(Lab_ID)$lengths)
)
gives
Sample_No Lab_ID Occurrence
1 1234 2 1
2 1235 2 2
3 1236 2 3
4 2344 3 1
5 3425 4 1
6 2341 5 1
7 6756 5 2
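Note that rle counts runs of consecutive equal values, so the option above assumes that rows with the same Lab_ID sit next to each other. If that is not guaranteed, one possible sketch uses ave instead (the df below only re-creates the sample data from the question for illustration):

```r
# Sample data re-created from the question (for illustration only)
df <- data.frame(
  Sample_No = c(1234, 1235, 1236, 2344, 3425, 2341, 6756),
  Lab_ID    = c(2, 2, 2, 3, 4, 5, 5)
)

# Number the rows within each Lab_ID group, regardless of row order
df$Occurrence <- ave(seq_along(df$Lab_ID), df$Lab_ID, FUN = seq_along)
df
```

Because ave groups by value rather than by run, this should also work when the same Lab_ID reappears further down the dataframe.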
A data.table option
> setDT(df)[, Occurrence := rleid(Sample_No), Lab_ID][]
Sample_No Lab_ID Occurrence
1: 1234 2 1
2: 1235 2 2
3: 1236 2 3
4: 2344 3 1
5: 3425 4 1
6: 2341 5 1
7: 6756 5 2
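The rleid(Sample_No) call relies on Sample_No changing from row to row within each Lab_ID. An alternative sketch that does not depend on Sample_No at all is seq_len(.N) per group (dt is a hypothetical name; the data is re-created from the question):

```r
library(data.table)

# Sample data re-created from the question (for illustration only)
dt <- data.table(
  Sample_No = c(1234, 1235, 1236, 2344, 3425, 2341, 6756),
  Lab_ID    = c(2, 2, 2, 3, 4, 5, 5)
)

# .N is the number of rows in the current group, so seq_len(.N)
# numbers the rows 1..N inside each Lab_ID group
dt[, Occurrence := seq_len(.N), by = Lab_ID]
dt[]
```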
Upvotes: 1
Reputation: 70623
Here's a solution using rle, without loading more than a dozen additional packages.
> x <- c(2,2,2, 3, 4, 5,5)
>
> cs <- rle(x)
>
> xy <- cs$lengths
>
> out <- mapply(
+ FUN = function(x) seq(from = 1, to = x, by = 1),
+ xy
+ )
>
> data.frame(
+ lab_id = x,
+ occurrence = unlist(out)
+ )
lab_id occurrence
1 2 1
2 2 2
3 2 3
4 3 1
5 4 1
6 5 1
7 5 2
Upvotes: 2
Reputation: 1466
If you want to use the tidyverse, or in this case dplyr:
library(tidyverse) # load library (dplyr is part of it)
df <- df %>%
  group_by(Lab_ID) %>% # for every lab ID
  summarise(Occurrence = n()) # count occurrences (one row per lab ID)
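The summarise call above returns one row per lab ID with its total count. To get the running number on every row, as in the desired output in the question, a possible variant is mutate with row_number() (a sketch; the sample data is re-created from the question):

```r
library(dplyr)

# Sample data re-created from the question (for illustration only)
df <- data.frame(
  Sample_No = c(1234, 1235, 1236, 2344, 3425, 2341, 6756),
  Lab_ID    = c(2, 2, 2, 3, 4, 5, 5)
)

df <- df %>%
  group_by(Lab_ID) %>%                 # for every lab ID
  mutate(Occurrence = row_number()) %>% # running number within the group
  ungroup()
df
```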
Upvotes: 1
Reputation: 73
If you would like to count the occurrences of each Lab_ID, you could use the {dplyr} package:
library(dplyr)
df %>%
  count(Lab_ID, name = "Occurrence")
Or you could achieve the same using the {data.table} package as follows:
library(data.table)
setDT(df)[, .(Occurrence = .N),
by = Lab_ID]
Upvotes: 1