Spooked

Reputation: 596

Consecutively count the occurrences of a value in an R dataframe

I have a dataframe that looks like this

Sample_No Lab_ID
1234       2
1235       2
1236       2
2344       3
3425       4
2341       5
6756       5
...

I want to count how many times each Lab_ID occurs, but put the occurrence number next to each row in a new column of the dataframe, so that it looks something like the following

Sample_No Lab_ID   Occurrence
1234       2           1
1235       2           2
1236       2           3
2344       3           1 
3425       4           1
2341       5           1 
6756       5           2
...

I can get a list of the unique values by using

I could do something like

table(df$Lab_ID)

but that produces a table summarizing the count

Any help appreciated.

Upvotes: 2

Views: 65

Answers (4)

ThomasIsCoding

Reputation: 101034

A base R option using sequence + rle

transform(
  df,
  Occurrence = sequence(rle(Lab_ID)$lengths)
)

gives

  Sample_No Lab_ID Occurrence
1      1234      2          1
2      1235      2          2
3      1236      2          3
4      2344      3          1
5      3425      4          1
6      2341      5          1
7      6756      5          2
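To see why this works, the one-liner can be unpacked into its two steps. This is a minimal sketch in base R; the vector name lab_id is just an illustration holding the Lab_ID column from the question's sample data.

```r
# Lab_ID column from the question's sample data (illustrative vector name)
lab_id <- c(2, 2, 2, 3, 4, 5, 5)

# rle() finds runs of identical consecutive values
runs <- rle(lab_id)
runs$values   # the value of each run:  2 3 4 5
runs$lengths  # the length of each run: 3 1 1 2

# sequence() expands each run length n into 1..n and concatenates the pieces
occurrence <- sequence(runs$lengths)
occurrence    # 1 2 3 1 1 1 2
```

Because rle() only looks at consecutive values, the counter restarts every time Lab_ID changes from one row to the next.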

A data.table option

> setDT(df)[, Occurrence := rleid(Sample_No), Lab_ID][]
   Sample_No Lab_ID Occurrence
1:      1234      2          1
2:      1235      2          2
3:      1236      2          3
4:      2344      3          1
5:      3425      4          1
6:      2341      5          1
7:      6756      5          2
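One thing to be aware of: the two options agree on the sample data because each Lab_ID sits in one contiguous block. If the same Lab_ID can reappear later, counting within runs and counting within groups give different results. Here is a small base R sketch of the difference, using hypothetical unsorted data (not from the question) and ave() as a base R stand-in for the grouped version:

```r
# Hypothetical unsorted data: lab 2 shows up again after lab 3
lab_id <- c(2, 2, 3, 2, 3)

# Run-based: the counter restarts whenever the value changes
by_run <- sequence(rle(lab_id)$lengths)

# Group-based: the counter keeps going for each value across the whole vector
by_group <- ave(seq_along(lab_id), lab_id, FUN = seq_along)

data.frame(lab_id, by_run, by_group)
```

Which of the two you want depends on whether "occurrence" means position within a consecutive block or position among all rows with that Lab_ID.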

Upvotes: 1

Roman Luštrik

Reputation: 70623

Here's a solution using rle that doesn't require loading a dozen or more additional packages.

> x <- c(2,2,2, 3, 4, 5,5)
> 
> cs <- rle(x)
> 
> xy <- cs$lengths
> 
> out <- mapply(
+   FUN = function(x) seq(from = 1, to = x, by = 1),
+   xy,
+   SIMPLIFY = FALSE  # always return a list, even if all runs have the same length
+ )
> 
> data.frame(
+   lab_id = x,
+   occurrence = unlist(out)
+ )
  lab_id occurrence
1      2          1
2      2          2
3      2          3
4      3          1
5      4          1
6      5          1
7      5          2

Upvotes: 2

Sandwichnick

Reputation: 1466

If you want to use the tidyverse, or in this case dplyr:

library(tidyverse) # load library

df <- df %>%
  group_by(Lab_ID) %>% # for every lab ID
  summarise(Occurrence = n()) # count occurrences
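Note that summarise() collapses each group to a single row, so this returns the total per Lab_ID rather than a counter next to every sample. The sketch below contrasts the two results in base R on the question's sample data; the object name totals is only illustrative. In dplyr, I believe the per-row equivalent would be mutate(Occurrence = row_number()) after group_by(Lab_ID), but I have not run that here.

```r
# Sample data from the question
df <- data.frame(
  Sample_No = c(1234, 1235, 1236, 2344, 3425, 2341, 6756),
  Lab_ID    = c(2, 2, 2, 3, 4, 5, 5)
)

# One row per Lab_ID with its total count (the shape a grouped summary has)
totals <- as.data.frame(table(Lab_ID = df$Lab_ID), responseName = "Occurrence")

# One value per original row: the row's position within its Lab_ID group
df$Occurrence <- ave(seq_len(nrow(df)), df$Lab_ID, FUN = seq_along)

totals
df
```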

Upvotes: 1

Joshua Entrop

Reputation: 73

If you would like to count the occurrence of each Lab_ID you could either use the {dplyr} package:

library(dplyr)

df %>%
  count(Lab_ID, name = "Occurrence")

Or you could achieve the same using the {data.table} package as follows:

library(data.table)

setDT(df)[, .(Occurrence = .N),
              by = Lab_ID]

Upvotes: 1
