Reputation: 1577
Let's say I have the following form of data in a dataframe in R:
Property 1 | Property 2 | ... | Property n
A          | B          | ... | R
C          | A          | ... | S
D          | F          | ... | C
.          | .          |     | .
.          | .          |     | .
.          | .          |     | .
R          | Z          | ... | X
where each cell of the n properties can take any of the letters A to Z. For each row, I would like to count how many times each of the 26 letters appears in that row and store each count in a new column next to Property n. For example, if the first row contains A seven times, B six times, C zero times, and so on across its n properties, the code should give me the following table:
Property 1 | Property 2 | ... | Property n | A | B | C | ... | Z
A          | B          | ... | R          | 7 | 6 | 0 | ... | 2
C          | A          | ... | S          |
D          | F          | ... | C          |
.          | .          |     | .          |
.          | .          |     | .          |
.          | .          |     | .          |
R          | Z          | ... | X          |
Is there a function in R that does that? Although it would be slow, I thought I could write a loop over each letter and each row, of the form
x <- vector(length=nrow(tr))
for (i in 1:nrow(tr)) {
x[i] <- count(tr[i,], vars="A")
}
But then I get the error
Error in unique.default(x) :
unique() can only be applied to vectors
or, even worse, if "A" does not appear even once among the n properties, I get the error
Error in eval(expr, envir, enclos) : object 'A' not found
What is a possible solution here?
Upvotes: 1
Views: 43
Reputation: 14360
You could use lapply with rowSums to do this rather quickly. I generated some fake data using only three "Properties".
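set.seed(1)
# Fake data: three property columns, six rows of random letters
df <- data.frame(Property1 = sample(LETTERS, 6), Property2 = sample(LETTERS, 6), Property3 = sample(LETTERS, 6))
# For each letter, df==x marks the matching cells and rowSums counts them per row
df[,LETTERS] <- lapply(LETTERS, function(x) rowSums(df==x))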
A snippet of the result looks like:
df[,c(1:6)]
Property1 Property2 Property3 A B C
1 J G M 0 0 0
2 T J O 0 0 0
3 W A L 1 0 0
4 E I E 0 0 0
5 O T S 0 0 0
6 C H Y 0 0 1
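One thing to watch with this approach: if the line is run a second time, the newly added count columns are themselves compared against each letter. A minimal sketch of a variant that records the property columns first is shown below. The data and the name prop_cols are made up for illustration (they mirror the first rows of the question), and na.rm = TRUE is an added assumption that missing cells should simply be skipped.

```r
# Hypothetical example data mirroring the first rows of the question
df <- data.frame(
  Property1 = c("A", "C", "D"),
  Property2 = c("B", "A", "F"),
  Property3 = c("R", "S", "C"),
  stringsAsFactors = FALSE
)

# Remember which columns hold the properties before any counts are added
# (prop_cols is an illustrative name, not part of the original answer)
prop_cols <- names(df)

# Count each letter across the property columns only; NA cells are ignored
df[LETTERS] <- lapply(LETTERS, function(x) rowSums(df[prop_cols] == x, na.rm = TRUE))

# Show the properties next to the first three count columns
df[, c(prop_cols, "A", "B", "C")]
```

Because the comparison is restricted to prop_cols, re-running the counting line gives the same result, and each row's counts sum to the number of non-missing properties.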
Upvotes: 2