user1754348

Reputation: 81

How to generate counts of positive values by column in data.table

I have some data tables that consist of several columns and many thousands of rows. My data looks something like:

iteration V1  V2  V3  V4
1         -2  3   -4   1
2         -2  3   -3   4
3         -2  3    7   -8
4         -2  3   -4   2
5         -2  3   -4   -3

I have been trying to figure out how to calculate counts of positive values in each column, and the proportion of positive counts to all counts in a column.

This seems fairly simple but I can't figure out how to output a data.table that has counts by column in it.

I can do this by combining a bunch of statements like the following, one per column, but there has to be a better way. Any advice for a tired mind?

nrow(dat[V2 > 0])

Upvotes: 0

Views: 1089

Answers (1)

ranlot

Reputation: 656

Assuming your data frame is called df (the iteration column is left out here):

df <- data.frame(
  V1 = c(-2, -2, -2, -2, -2),
  V2 = c(3, 3, 3, 3, 3),
  V3 = c(-4, -3, 7, -4, -4),
  V4 = c(1, 4, -8, 2, -3)
)

You could start by storing the number of rows:

nRows <- nrow(df)

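Then define a helper function that takes a column name and returns both statistics for that column:

calcStats <- function(x) {
  # x is a column name; count the strictly positive entries in that column
  pos <- sum(df[, x] > 0)
  c("number of positives" = pos, "proportion of positives" = pos / nRows)
}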

and get the result with:

result <- as.data.frame(Map(calcStats, colnames(df)))

                        V1 V2  V3  V4
number of positives      0  5 1.0 3.0
proportion of positives  0  1 0.2 0.6
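
Since the question asks about data.table specifically, here is a minimal sketch of the same calculation done inside a data.table with .SD and .SDcols. It assumes the data.table package is installed and that the table is called dat, as in the question. The names valueCols and result are only illustrative, and the long layout (one row per column) is a choice made here, not something the question requires.

```r
library(data.table)

# Sample data from the question, including the iteration counter.
dat <- data.table(
  iteration = 1:5,
  V1 = c(-2, -2, -2, -2, -2),
  V2 = c(3, 3, 3, 3, 3),
  V3 = c(-4, -3, 7, -4, -4),
  V4 = c(1, 4, -8, 2, -3)
)

# Columns to summarise: everything except the iteration counter.
valueCols <- setdiff(names(dat), "iteration")

# .SD holds only the columns listed in .SDcols. sum() of a logical vector
# counts the TRUE values, and mean() gives the share of TRUE values.
result <- dat[, .(
  column     = valueCols,
  count      = sapply(.SD, function(x) sum(x > 0)),
  proportion = sapply(.SD, function(x) mean(x > 0))
), .SDcols = valueCols]

print(result)
```

For the sample data this should give counts of 0, 5, 1 and 3 with proportions 0, 1, 0.2 and 0.6, matching the table above. Because mean(x > 0) divides by the column length, there is no need to store the row count separately.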

Upvotes: 1
