Reputation: 61
I have a data frame like this :
ID TYPE
100010700239816 932100Y
100010700239816 9DDDDDDY
100010700239822 0000000000011222211121110100000000000000Y
The output should be like this :
ID 9 3 2 1 0 Y D
100010700239816 2 1 1 1 2 2 6
100010700239822 0 0 5 9 11 1 0
So this is just a sample data, there could be different elements inside a single cell and count of all of those elements is what I am looking for.
I have tried :
a <- count(TEST$TYPE)
and
a <- table(TEST$TYPE)
But I am not able to get the counts of the elements for id's, how do I do this?
Upvotes: 2
Views: 136
Reputation: 3200
Another way to go with stringr
df2 <- aggregate(TYPE ~ ID, data=df, FUN=paste, collapse='')
df2[, unique(unlist(strsplit(df2$TYPE, "")))] <- 0
library(stringr)
df2[, unique(unlist(strsplit(df2$TYPE, "")))] <- sapply(colnames(df2[, unique(unlist(strsplit(df2$TYPE, "")))]),
function(i) str_count(df2$TYPE, i))
Upvotes: 0
Reputation: 39174
We can also use the functions from tidyverse
. We can strsplit
the string, unnest
the string and count
the numbers of each string-ID combination. Finally, we can spread
the data frame to get the desired output (dt2
).
dt <- read.table(text = " ID TYPE
100010700239816 932100Y
100010700239816 9DDDDDDY
100010700239822 0000000000011222211121110100000000000000Y",
header = TRUE, stringsAsFactors = FALSE)
library(tidyverse)
dt2 <- dt %>%
mutate(TYPE = strsplit(TYPE, "")) %>%
unnest() %>%
count(ID, TYPE) %>%
spread(TYPE, n, fill = 0) %>%
select(c("ID", "9", "3", "2", "1", "0", "Y", "D"))
Upvotes: 2
Reputation: 887911
We can split the 'TYPE', grouped by 'ID' and dcast
into 'wide' format
library(data.table)
dcast(setDT(df1)[, unlist(strsplit(TYPE, "")) , ID], ID ~V1)
Upvotes: 3