Reputation: 650
I am new to learning R and have data table like below
obj= c('A','A','A','A','A','A','A','A','A')
x=c('Poor','Good','Good','Good','Poor','Poor','Poor','Good','Poor')
y=c('High','Low','Low','Low','Low','Low','Low','Low','High')
df = data.table(obj,x,y)
df
obj x y
1: A Poor High
2: A Good Low
3: A Good Low
4: A Good Low
5: A Poor Low
6: A Poor Low
7: A Poor Low
8: A Good Low
9: A Poor High
I want to count number of row with good, bad from x columns also High, low from y column for a row in obj column.
I was able to do it for column x with below code.
dcast(df, obj ~ x, value.var=c("x"), fun.aggregate = length)
which gives me
obj Good Poor
1: A 4 5
However what I am trying to get is for both x and y columns like below.
obj Good Poor High Low
A 4 5 2 7
Upvotes: 1
Views: 640
Reputation: 66819
From @DavidArenburg's comment-answers, you can do...
df[, Reduce(`c`, lapply(.SD, function(x) as.list(table(x)))), by = obj]
# obj Good Poor High Low
# 1: A 4 5 2 7
Or ...
dcast(melt(df, "obj"),
obj ~ paste(variable, value, sep="."),
fun.aggregate = length)
# obj x.Good x.Poor y.High y.Low
# 1: A 4 5 2 7
Or...
melt(df, "obj")[, .N, keyby=.(obj, variable, value)]
# obj variable value N
# 1: A x Good 4
# 2: A x Poor 5
# 3: A y High 2
# 4: A y Low 7
The last way is better for later analysis, since variable and value can be accessed.
Upvotes: 1
Reputation: 4328
I'm sure to incur some wrath from the data.table
folks here, but here's a tidyverse
solution for anyone who's interested:
df %>%
rowid_to_column() %>%
mutate(value = 1,
value_1 = 1) %>%
spread(key = x, value = value) %>%
spread(key = y, value = value_1) %>%
select(-rowid) %>%
mutate_at(vars(Good:Low), funs(if_else(is.na(.), 0, .))) %>%
group_by(obj) %>%
summarize_all(funs(sum))
Upvotes: 0