Subhra Sankha Sardar
Subhra Sankha Sardar

Reputation: 61

Count elements in a dataframe in R

I have a data frame like this :

      ID                          TYPE
  100010700239816                932100Y
  100010700239816                9DDDDDDY
  100010700239822                0000000000011222211121110100000000000000Y

The output should be like this :

    ID                9   3   2   1   0   Y   D
 100010700239816      2   1   1   1   2   2   6
 100010700239822      0   0   5   9   11  1   0

So this is just a sample data, there could be different elements inside a single cell and count of all of those elements is what I am looking for.

I have tried :

 a <- count(TEST$TYPE)

and

 a <- table(TEST$TYPE)

But I am not able to get the counts of the elements for id's, how do I do this?

Upvotes: 2

Views: 136

Answers (3)

Patrik_P
Patrik_P

Reputation: 3200

Another way to go with stringr

df2 <- aggregate(TYPE ~  ID, data=df, FUN=paste, collapse='')
df2[, unique(unlist(strsplit(df2$TYPE, "")))] <- 0
library(stringr)
df2[, unique(unlist(strsplit(df2$TYPE, "")))] <- sapply(colnames(df2[, unique(unlist(strsplit(df2$TYPE, "")))]), 
                                                      function(i) str_count(df2$TYPE, i))

Upvotes: 0

www
www

Reputation: 39174

We can also use the functions from tidyverse. We can strsplit the string, unnest the string and count the numbers of each string-ID combination. Finally, we can spread the data frame to get the desired output (dt2).

dt <- read.table(text = "     ID                          TYPE
  100010700239816                932100Y
                 100010700239816                9DDDDDDY
                 100010700239822                0000000000011222211121110100000000000000Y",
                 header = TRUE, stringsAsFactors = FALSE)

library(tidyverse)

dt2 <- dt %>%
  mutate(TYPE = strsplit(TYPE, "")) %>%
  unnest() %>%
  count(ID, TYPE) %>%
  spread(TYPE, n, fill = 0) %>%
  select(c("ID", "9", "3", "2", "1", "0", "Y", "D"))

Upvotes: 2

akrun
akrun

Reputation: 887911

We can split the 'TYPE', grouped by 'ID' and dcast into 'wide' format

library(data.table)
dcast(setDT(df1)[, unlist(strsplit(TYPE, "")) , ID], ID ~V1)

Upvotes: 3

Related Questions