First, this is a very basic question that I'm unsure of how to phrase. If the question is a duplicate (though I checked using what I thought might be appropriate phrasing), I'll obviously retract and appreciate the link.
Second, I am sure there is an easier way to do what I'm trying, but don't want to get off-track.
OK. I'm trying to get a table of column proportions from a matrix of 0/1s (the proportion of 1s conditional on the value of another variable, which is PARTY in this case).
My data.frame is m103, with dimensions (437, 91), and the following works (as in, it produces what I want):
prop.table(as.matrix(ddply(m103, .(PARTY), sum, na.rm=T)))
But of course, I want to actually keep the output, and this is where the error arises. If I do this:
a <- prop.table(as.matrix(ddply(m103, .(PARTY), sum, na.rm=T)))
Things are great. But IMMEDIATELY after this, if I try:
m103.avg.prop <- prop.table(as.matrix(ddply(m103, .(PARTY), sum, na.rm=T)))
I get the error:
Error in FUN(X[[2L]], ...) : only defined on a data frame with all numeric variables
I'd like to keep a rational naming scheme going in my code (which the second example would continue), but I can't tell if this has something to do with what I've tried to assign the output to, or something else.
Many thanks!
EDIT: Let's see if I can be more explicit
#Data import
m103 <- read.csv("103_members_party.csv", header=T)
#See the first few rows/columns
m103[1:5,1:5]
#Produces this:
ID PARTY X930 X461 X137
1 15245 100 0 0 0
2 15000 100 0 0 0
3 29108 200 0 0 0
4 15001 100 0 0 0
5 29132 100 0 0 0
#Sum and get col percentages by PARTY (sums the 1's when PARTY==100, PARTY==200, etc)
#WITHOUT assigning to anything
prop.table(as.matrix(ddply(m103, .(PARTY), sum, na.rm=T)))
#Produces:
PARTY V1
[1,] 1.122515e-05 0.580000465
[2,] 2.245030e-05 0.416619418
[3,] 3.681849e-05 0.003309623
#With assignment to a
a <- prop.table(as.matrix(ddply(m103, .(PARTY), sum, na.rm=T)))
a
#Produces
PARTY V1
[1,] 1.122515e-05 0.580000465
[2,] 2.245030e-05 0.416619418
[3,] 3.681849e-05 0.003309623
#Now, assignment to m103.avg.prop
m103.avg.prop <- prop.table(as.matrix(ddply(m103, .(PARTY), sum, na.rm=T)))
#results in error:
Error in FUN(X[[2L]], ...) :
only defined on a data frame with all numeric variables
The error you're getting is because you're trying to sum something that isn't a number. Without reproducible code I can't tell you exactly what is going on. But, one of the reasons we ask for a reproducible example is that in the process of making one, you will often discover the problem on your own.
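As a minimal sketch of how this error can arise (the two small data frames below are made up for illustration and are not your data): calling sum() on a data frame works when every column is numeric, and fails as soon as one column is text.

```r
# Hypothetical toy data, not the asker's file
numeric_only <- data.frame(PARTY = c(100, 200), X930 = c(0, 1))
with_text    <- data.frame(PARTY = c("100", "200"), X930 = c(0, 1),
                           stringsAsFactors = FALSE)

# Works: every column is numeric, so all cells are added together
total <- sum(numeric_only, na.rm = TRUE)
total

# Fails: PARTY is character. Capture the message instead of stopping.
msg <- tryCatch(
  sum(with_text, na.rm = TRUE),
  error = function(e) conditionMessage(e)
)
msg
```

The exact wording of the message can differ slightly between R versions, but it should start with "only defined on a data frame".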
In this case, I assume the data came from somewhere like Excel, which is notorious for doing surprising things to data. Try looking at str(m103); one of the columns will likely be a character vector (or a factor) rather than numeric. Failing that, I would have to see your data.
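A rough sketch of that check, assuming a stand-in data frame (m103_demo and its values are hypothetical, chosen only so that one column is read in as text):

```r
# Hypothetical stand-in for m103 with PARTY stored as text
m103_demo <- data.frame(
  ID    = c(15245, 15000, 29108),
  PARTY = c("100", "100", "200"),
  X930  = c(0, 0, 1),
  stringsAsFactors = FALSE
)

# TRUE for each column that is numeric
is_num <- vapply(m103_demo, is.numeric, logical(1))

# Names of the columns that would trigger the error in sum()
bad_cols <- names(m103_demo)[!is_num]
bad_cols

# One possible repair: go through as.character() first, so that a factor's
# internal level codes are not mistaken for the values themselves
m103_demo[bad_cols] <- lapply(m103_demo[bad_cols],
                              function(x) as.numeric(as.character(x)))
str(m103_demo)
```

If a column holds entries that are not numbers at all (stray text, for example), as.numeric() turns them into NA with a warning, which at least points to the rows worth inspecting.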
However, there should be no difference between your assignment to a and your assignment to m103.avg.prop. As a side note, I like to avoid numbers in my variable names wherever possible, just to avoid confusing myself!
EDIT: Adding runnable code:
> m103<-data.frame(ID=c(15245, 15000, 29108, 15001, 29132),PARTY=c(100, 100, 200, 100, 100),X930=c(0, 0, 1, 0, 0),X461=c(0, 0, 0, 1, 1),X137=c(1, 1, 1, 1, 1))
> m103
ID PARTY X930 X461 X137
1 15245 100 0 0 1
2 15000 100 0 0 1
3 29108 200 1 0 1
4 15001 100 0 1 1
5 29132 100 0 1 1
> prop.table(as.matrix(ddply(m103, .(PARTY), sum, na.rm=T)))
PARTY V1
[1,] 0.0009579095 0.7163630
[2,] 0.0019158189 0.2807633
> a <- prop.table(as.matrix(ddply(m103, .(PARTY), sum, na.rm=T)))
> m103.avg.prop <- prop.table(as.matrix(ddply(m103, .(PARTY), sum, na.rm=T)))
> a
PARTY V1
[1,] 0.0009579095 0.7163630
[2,] 0.0019158189 0.2807633
> m103.avg.prop
PARTY V1
[1,] 0.0009579095 0.7163630
[2,] 0.0019158189 0.2807633
>
I still cannot replicate your problem. Like I said above, the output of str(m103) and the output of str(a) will be informative, as will sessionInfo(). Short of that, I'll stick with my previous guesses...