Reputation: 135
I have a data frame x like this:
Id Group Var1
001 A yes
002 A no
003 A yes
004 B no
005 B yes
006 C no
I want to create a data frame like this:
Group yes no
A 2 1
B 1 1
C 0 1
The function aggregate works well:
aggregate(x$Var1 ~ x$Group,FUN=summary)
but I am not able to create a dataframe with the results.
If I try using ddply:
ddply(x,"Group",function(x) summary(x$Var1))
I obtain the error: Results do not have equal lengths.
What am I doing wrong?
Thanks.
Upvotes: 1
Views: 4207
Reputation: 121608
I introduced an NA into your data:
dat <- read.table(text = 'Id Group Var1
001 A yes
002 A no
003 A NA ## here!
004 B no
005 B yes
006 C no', header = TRUE, stringsAsFactors = TRUE)
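You need to remove the NAs before calling summary. summary adds an extra NA's entry only for groups that contain an NA, so the per-group results have different lengths and ddply cannot combine them. The formula method of aggregate has a default setting of na.action = na.omit, which drops those rows first, so the extra NA's column never appears. Here is a workaround that removes the NAs before the summary: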
library(plyr)
ddply(dat, "Group", function(x) {
  # Drop missing values so every group returns the same set of counts
  summary(na.omit(x$Var1))
})
Group no yes
1 A 1 1
2 B 1 1
3 C 1 0
which is equivalent to
x <- dat
aggregate(x$Var1 ~ x$Group,FUN=summary)
x$Group x$Var1.no x$Var1.yes
1 A 1 1
2 B 1 1
3 C 1 0
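To see where the unequal lengths come from, here is a minimal sketch in base R. It assumes Var1 is stored as a factor and rebuilds the toy data by hand; the names per_group and clean are only illustrative.

```r
# Toy data mirroring the example above, with Var1 stored as a factor.
dat <- data.frame(
  Group = c("A", "A", "A", "B", "B", "C"),
  Var1  = factor(c("yes", "no", NA, "no", "yes", "no"))
)

# summary() on a factor appends an "NA's" entry only when NAs are present,
# so group A returns one more element than groups B and C.
per_group <- lapply(split(dat$Var1, dat$Group), summary)
sapply(per_group, length)

# Dropping the NAs first makes every group return the same two counts.
clean <- lapply(split(dat$Var1, dat$Group), function(v) summary(na.omit(v)))
sapply(clean, length)
```

With equal-length results per group, ddply has no trouble binding them into one data frame.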
Upvotes: 3
Reputation: 193667
This doesn't answer your question about ddply, but it should help you with your aggregate output. The second column in the aggregate command that you used is a matrix, but you can wrap the whole output in a do.call(data.frame, ...) statement to get a data frame instead. Assuming your data.frame is called "mydf":
temp <- do.call(data.frame, aggregate(Var1 ~ Group, mydf, summary))
temp
# Group Var1.no Var1.yes
# 1 A 1 2
# 2 B 1 1
# 3 C 1 0
str(temp)
# 'data.frame': 3 obs. of 3 variables:
# $ Group : Factor w/ 3 levels "A","B","C": 1 2 3
# $ Var1.no : int 1 1 1
# $ Var1.yes: int 2 1 0
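As a quick check of the matrix column described above, the following sketch inspects the raw aggregate result before and after flattening. It assumes mydf holds the question's data with Var1 as a factor; agg and flat are illustrative names.

```r
# The question's data, rebuilt by hand with Var1 as a factor.
mydf <- data.frame(
  Group = c("A", "A", "A", "B", "B", "C"),
  Var1  = factor(c("yes", "no", "yes", "no", "yes", "no"))
)

agg <- aggregate(Var1 ~ Group, mydf, summary)
dim(agg)             # 3 rows but only 2 columns
is.matrix(agg$Var1)  # the counts sit in a single matrix-valued column

# do.call(data.frame, ...) spreads the matrix into ordinary columns.
flat <- do.call(data.frame, agg)
names(flat)
```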
Alternatively, you might look at table:
table(mydf$Group, mydf$Var1)
#
# no yes
# A 1 2
# B 1 1
# C 1 0
as.data.frame.matrix(table(mydf$Group, mydf$Var1))
# no yes
# A 1 2
# B 1 1
# C 1 0
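If you want the exact layout from the question, with Group as a regular column and yes before no, one possible follow-up is sketched below. It again assumes mydf holds the question's data; wide and res are illustrative names.

```r
# The question's data, rebuilt by hand.
mydf <- data.frame(
  Group = c("A", "A", "A", "B", "B", "C"),
  Var1  = factor(c("yes", "no", "yes", "no", "yes", "no"))
)

# Cross-tabulate, then convert the table to a wide data frame.
wide <- as.data.frame.matrix(table(mydf$Group, mydf$Var1))

# Move the row names into a Group column and reorder the count columns.
res <- cbind(Group = rownames(wide), wide[c("yes", "no")])
rownames(res) <- NULL
res
```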
Upvotes: 4