PrettyClose

Reputation: 445

Counting the number of elements in a dataframe

I have a dataframe like this:

mydata
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
 [1,] "-"  "-"  "-"  "-"  "+"  "+"  "-"  "-"  "+"  "-"   "0"   "-"   "0"  
 [2,] "-"  "+"  "-"  "+"  "-"  "-"  "-"  "-"  "+"  "+"   "-"   "+"   NA   
 [3,] "+"  

For each row, how can I count the number of "-", "+", and "0" elements?

For example, the first row has 8 "-" elements, and the last row has 1 "+", 0 "-", and 0 "0".

I used table(mydata), but I don't get the expected result: for the last row it only gives me 1 for "+" (I also want 0 for "-" and 0 for "0").

Upvotes: 2

Views: 2948

Answers (2)

r2evans

Reputation: 160407

You can still use table, with a small trick.

Some sample data:

set.seed(2)
m <- matrix(sample(c('-','+','0'),size=39,replace=TRUE,prob=c(0.45,0.45,0.1)), nrow=3)
m
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13]
# [1,] "+"  "+"  "+"  "-"  "-"  "-"  "+"  "+"  "+"  "+"   "+"   "-"   "-"  
# [2,] "-"  "0"  "-"  "-"  "+"  "0"  "+"  "-"  "-"  "0"   "+"   "-"   "+"  
# [3,] "-"  "0"  "-"  "+"  "+"  "+"  "-"  "+"  "+"  "+"   "-"   "-"   "-"  

The trick is to prepend every possible value to each row, so that each value appears at least once, and then subtract 1 from every count in the table:

apply(m, 1, function(a) table(c('-','+','0',a))-1L)
#   [,1] [,2] [,3]
# -    5    6    6
# +    8    4    6
# 0    0    3    1

The result is transposed (one column per input row). If you prefer one output row per input row, transpose it back:

t(apply(m, 1, function(a) table(c('-','+','0',a))-1))
#      - + 0
# [1,] 5 8 0
# [2,] 6 4 3
# [3,] 6 6 1

NB: apply returns a matrix only when every row yields a result of the same length. Here, since we know all possible input values, the table trick guarantees an integer vector of length 3 for every row. If a row contains any other value, the lengths differ and the result is returned as a ragged list.
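To illustrate the point about result lengths, here is a minimal sketch using a made-up two-row matrix (`m2` is hypothetical, not the data above) whose second row contains no "0". Without the padding, apply should fall back to a list; with it, the result is a matrix:

```r
# Hypothetical 2-row matrix: the second row contains no "0"
m2 <- matrix(c("-", "+", "0",
               "-", "+", "+"), nrow = 2, byrow = TRUE)

# Without padding, the per-row tables have lengths 3 and 2,
# so apply() cannot simplify them and returns a list
raw <- apply(m2, 1, table)
is.list(raw)

# With padding, every row yields a length-3 table,
# so apply() simplifies the result to a 3 x 2 matrix
padded <- apply(m2, 1, function(a) table(c("-", "+", "0", a)) - 1L)
is.matrix(padded)
dim(padded)
```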

If you also want to count the NAs, add NA to the padding and tell table to include NAs in the totals:

t(apply(m, 1, function(a) table(c('-','+','0',a,NA),useNA='always')-1L))
#      - + 0 <NA>
# [1,] 5 8 0    0
# [2,] 6 4 3    0
# [3,] 6 6 1    0
m[1,2] <- NA
t(apply(m, 1, function(a) table(c('-','+','0',a,NA),useNA='always')-1L))
#      - + 0 <NA>
# [1,] 5 7 0    1
# [2,] 6 4 3    0
# [3,] 6 6 1    0

(The order in which the "known values" are added does not matter, as you can see here.)
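As an aside, a different technique from the padding trick is to convert each row to a factor with fixed levels, since table reports a count (possibly 0) for every level. This is only a sketch on a made-up matrix (`m3` and `lv` are hypothetical names); note that NAs are dropped by default here:

```r
# Hypothetical sample: row 1 has an NA, row 2 has no "0"
m3 <- matrix(c("-", "+", "0", NA,
               "+", "+", "-", "-"), nrow = 2, byrow = TRUE)

# The full set of values we expect to see
lv <- c("-", "+", "0")

# table() on a factor counts every level, including absent ones,
# so each row yields a length-3 result without any padding
counts <- t(apply(m3, 1, function(a) table(factor(a, levels = lv))))
counts
```

With this approach the column order follows `lv`, so it can be chosen explicitly rather than left to the sort order of the values.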

Upvotes: 2

PrettyClose

Reputation: 445

I found a solution. I just have to do this (for the first row, for example):

count(mydata[1,][mydata[1,]=="+"])

which gives 3.

It's the same for "-" or "0": just replace "+" in the code with "-" or "0" to get the result.
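For reference, the same per-symbol count can be sketched with base R only, for all rows at once, by summing a logical comparison. The matrix below is a made-up stand-in (the real `mydata` is not fully shown above), and `na.rm = TRUE` is an assumption about how NAs should be handled:

```r
# Hypothetical stand-in for mydata, including one NA
mydata <- matrix(c("-", "+", "+", "0",
                   "+", NA,  "-", "-"), nrow = 2, byrow = TRUE)

# Number of "+" in each row; NA comparisons are ignored
plus_per_row <- rowSums(mydata == "+", na.rm = TRUE)

# Counts of all three symbols: one row per input row, one column per symbol
counts <- sapply(c("-", "+", "0"),
                 function(s) rowSums(mydata == s, na.rm = TRUE))
counts
```

Symbols that never occur in a row show up as 0 rather than being omitted.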

Upvotes: 0
