Reputation: 28159
I have a data frame with donations and names of donors.
**donation** **Donor**
25.00 Steve Smith
20.00 Jack Johnson
50.00 Mary Jackson
... ...
I'm trying to do some clustering using the pvclust
package. Unfortunately the package doesn't seem to take non-numerical data.
> rs1.pv1 <- parPvclust(cl, rs1, nboot=10)
Error in cor(x, method = "pearson", use = use.cor) : 'x' must be numeric
I have two questions.
1) Is there another package or method that would do this better?
2) Is there a way to "normalize" the donor names list? Ie get a list of unique donor names, assign each an id number and then insert the id number into the data frame in place of the character name.
Upvotes: 4
Views: 9493
Reputation: 10016
For number 2:
#If donor is a factor then
as.numeric(donor)
#will transform your factor to numeric.
#If it isn't, tranform it to a factor and the to numeric
as.numeric(as.factor(donor))
However, I'm not sure that transforming the donor list to a numeric and then using cor makes sense at all.
HTH
Upvotes: 5
Reputation: 226192
How about rs1 <- transform(rs1, Donor=as.numeric(factor(Donor)))
? (Warning: I haven't thought about what you're doing enough to know whether that makes sense -- so I'm only answering question #2, not question #1). Typically Donor
would already be a factor (this is what e.g. read.table
or read.csv
would do by default), so the factor()
part would be redundant.
Upvotes: 2