Reputation: 329
I have a data frame that looks like the following.
ID <- c('001', '002', '003', '004', '005')
TYPE <- c('ABB', 'BCC', 'AAA', 'BBA', 'BCC')
Group <- c('1', '2', '2', '2', '1')
df <- data.frame(ID, TYPE, Group)
df
   ID TYPE Group
1 001  ABB     1
2 002  BCC     2
3 003  AAA     2
4 004  BBA     2
5 005  BCC     1
I want a table showing how often each character of TYPE occurs within each group, plus the corresponding percentages. The counts should look like this:
      Group
       1  2
A      1  4
B      3  3
C      2  2
Total  6  9
And the share of each character within its group:
        Group
           1    2
A       0.17 0.44
B       0.50 0.33
C       0.33 0.22
Total%  1.00 1.00
I tried the following, but it throws an error:
str_count(df$TYPE[(df$Group==1], pattern = "A")
str_count(df$TYPE[(df$Group==2], pattern = "A")
str_count(df$TYPE[(df$Group==1], pattern = "B")
str_count(df$TYPE[(df$Group==2], pattern = "B")
str_count(df$TYPE[(df$Group==1], pattern = "C")
str_count(df$TYPE[(df$Group==2], pattern = "C")
Thanks in advance.
Upvotes: 3
Views: 2337
Reputation: 215127
You can use dplyr and tidyr:
library(dplyr); library(tidyr)
df %>% group_by(Group) %>% summarise(TYPE = unlist(strsplit(TYPE, ""))) %>%
group_by(Group, TYPE) %>% summarise(Count = n()) %>% spread(Group, Count)
# Source: local data frame [3 x 3]
#
#    TYPE     1     2
#   (chr) (int) (int)
# 1     A     1     4
# 2     B     3     3
# 3     C     2     2
To get the percentage count:
df %>% group_by(Group) %>% summarise(TYPE = unlist(strsplit(TYPE, ""))) %>%
group_by(Group, TYPE) %>% summarise(Count = n()) %>%
spread(Group, Count) %>% mutate_each(funs(round(./sum(.), 2)), -TYPE)
# Source: local data frame [3 x 3]
#
#    TYPE     1     2
#   (chr) (dbl) (dbl)
# 1     A  0.17  0.44
# 2     B  0.50  0.33
# 3     C  0.33  0.22
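On recent package versions the calls above are, as far as I know, superseded or deprecated: spread() by pivot_wider(), mutate_each() and funs() by across(), and a summarise() that returns several rows per group by reframe() (dplyr 1.1). A minimal sketch of the same idea with current verbs, assuming dplyr >= 1.0 and tidyr >= 1.1; the names counts and props are my own and not part of the original answer:

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  ID    = c("001", "002", "003", "004", "005"),
  TYPE  = c("ABB", "BCC", "AAA", "BBA", "BCC"),
  Group = c("1", "2", "2", "2", "1"),
  stringsAsFactors = FALSE  # keep TYPE as character on R < 4.0 as well
)

counts <- df %>%
  mutate(TYPE = strsplit(TYPE, "")) %>%   # list column: one vector of characters per row
  unnest(TYPE) %>%                        # one row per single character
  count(Group, TYPE, name = "Count") %>%  # frequency per Group/character pair
  pivot_wider(names_from = Group, values_from = Count, values_fill = 0)

# Column-wise proportions: divide every group column by its own total
props <- counts %>%
  mutate(across(-TYPE, ~ round(.x / sum(.x), 2)))

counts
props
```

values_fill = 0 only matters when a character is missing from one group; with the sample data every combination occurs.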
Upvotes: 2
Reputation: 28461
How about base R, with stack and table:
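# Split each TYPE into single characters, name each piece by its Group,
# then stack into (values, ind) pairs and cross-tabulate.
# as.character() guards against TYPE being a factor (the default before R 4.0).
tbl <- table(stack(`names<-`(strsplit(as.character(df$TYPE), ""), df$Group)))
tbl
#       ind
# values 1 2
#      A 1 4
#      B 3 3
#      C 2 2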
Then we can add percentages:
round(prop.table(tbl, 2), 2)
#       ind
# values    1    2
#      A 0.17 0.44
#      B 0.50 0.33
#      C 0.33 0.22
If you would also like the column totals (appended as a Sum row):
addmargins(tbl, 1)
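To get both tables from the question in one go, with the Total and Total% rows, here is a minimal base R sketch. It is a variation on the approach above, not the same code: it builds the long format with rep() and lengths() rather than stack(), and the names chars, long, counts and pct are placeholders of my own.

```r
df <- data.frame(
  ID    = c("001", "002", "003", "004", "005"),
  TYPE  = c("ABB", "BCC", "AAA", "BBA", "BCC"),
  Group = c("1", "2", "2", "2", "1"),
  stringsAsFactors = FALSE  # keep TYPE as character on R < 4.0 as well
)

# One vector of single characters per row of df
chars <- strsplit(df$TYPE, "")

# Long format: one row per character, with the Group repeated to match
long <- data.frame(
  char  = unlist(chars),
  Group = rep(df$Group, lengths(chars))
)

tbl <- table(long)  # characters in rows, groups in columns

# Counts with a Total row
counts <- rbind(tbl, Total = colSums(tbl))

# Column-wise proportions; the Total% row is summed from the unrounded
# values so that rounding error does not show up as 0.99
prop <- prop.table(tbl, 2)
pct  <- rbind(round(prop, 2), "Total%" = colSums(prop))

counts
pct
```

rbind() returns a plain matrix here, so the dimension labels (char, Group) are dropped from the printed result; addmargins() keeps them if that matters.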
Upvotes: 9