Amanda

Reputation: 85

Tallying up the number of values in a data frame in R from multiple columns

I would like to create a function that, for a given name, tallies up the number of values in columns L2, L3, and L4 that are greater than 0.

Name    L1     L2     L3    L4
Carl    1       1     0     2
Carl    0       1     4     1 
Joe     3       0     3     1
Joe     2       2     1     0

For example, someFunction(Carl) = 5 and someFunction(Joe) = 4

I do not want to sum up the values; for example, someFunction(Joe) = 7 is incorrect. I hope this makes sense. I am pretty stuck on this. Thanks!
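
To make the example reproducible, here is a minimal sketch that builds the table above as a data frame and contrasts counting with summing. The name df1 is an assumption (the question does not name the data frame); it is chosen only because the answers below use it.

```r
# Example data from the question; the name `df1` is an assumption
df1 <- data.frame(
  Name = c("Carl", "Carl", "Joe", "Joe"),
  L1 = c(1, 0, 3, 2),
  L2 = c(1, 1, 0, 2),
  L3 = c(0, 4, 3, 1),
  L4 = c(2, 1, 1, 0)
)

# Rows for Joe, restricted to the columns of interest
joe <- df1[df1$Name == "Joe", c("L2", "L3", "L4")]

count_joe <- sum(joe > 0)  # number of cells greater than 0 (wanted): 4
total_joe <- sum(joe)      # sum of the values (not wanted): 7
```

Comparing with `> 0` gives a logical matrix, and summing a logical matrix counts its TRUE entries, which is the tally being asked for.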

Upvotes: 1

Views: 1038

Answers (3)

Rahul

Reputation: 2759

I would encourage the tidyverse style of coding. With the dplyr and reshape2 packages, the code is concise and easy to read:

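library(dplyr)
library(reshape2)
df1 %>% 
  select(-L1) %>%                      # drop the column that is not counted
  melt(id.vars = 1, na.rm = TRUE) %>%  # long format: Name, variable, value
  group_by(Name) %>% 
  transmute(flag = value > 0) %>%      # TRUE where the value is greater than 0
  summarize(sum(flag))                 # count the TRUE flags per Name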


# A tibble: 2 × 2
    Name `sum(flag)`
  <fctr>       <int>
1   Carl           5
2    Joe           4

Upvotes: 0

akrun

Reputation: 887118

We can try with data.table. Convert the 'data.frame' to a 'data.table' (setDT(df1)). Then, grouped by 'Name', specify the columns of interest in .SDcols, unlist the Subset of Data.table (.SD), check whether each value is greater than 0, and take the sum of the resulting logical vector. The result is assigned by reference (:=) to create the 'N' column.

library(data.table)
setDT(df1)[, N := sum(unlist(.SD)>0), Name, .SDcols = L2:L4]
df1
#   Name L1 L2 L3 L4 N
#1: Carl  1  1  0  2 5
#2: Carl  0  1  4  1 5
#3:  Joe  3  0  3  1 4
#4:  Joe  2  2  1  0 4

Or another option is

setDT(df1)[,  N := sum(unlist(lapply(.SD, `>`, 0))), Name, .SDcols = L2:L4]

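Or we can use a rowsum/rowSums combination in base R (on the original 'data.frame', before setDT; on a 'data.table', df1[3:5] would select rows rather than columns)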

rowSums(rowsum(+(df1[3:5]>0), df1$Name))
#   Carl  Joe 
#   5    4 
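
The base R one-liner above can be unpacked step by step. This is only an illustrative sketch: the data frame is rebuilt from the question's table, columns are selected by name rather than by position, and the intermediate names (positive, per_column, counts) are hypothetical.

```r
# Plain data.frame built from the question's table (not a data.table)
df1 <- data.frame(
  Name = c("Carl", "Carl", "Joe", "Joe"),
  L1 = c(1, 0, 3, 2),
  L2 = c(1, 1, 0, 2),
  L3 = c(0, 4, 3, 1),
  L4 = c(2, 1, 1, 0)
)

# Step 1: logical matrix, TRUE where a value is greater than 0
positive <- df1[c("L2", "L3", "L4")] > 0

# Step 2: unary + turns TRUE/FALSE into 1/0; rowsum() then adds the rows
# that share a Name, giving per-column counts for each name
per_column <- rowsum(+positive, df1$Name)

# Step 3: add across the columns to get one count per name
counts <- rowSums(per_column)
```

Here per_column has one row per name (Carl: 2, 1, 2; Joe: 1, 2, 1), and counts is the named vector with Carl = 5 and Joe = 4.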

If we only need this for a particular 'Name', subset the rows first

setDT(df1)[Name == "Carl"][, sum(unlist(.SD) > 0), .SDcols = L2:L4]

Update

If we need a summarised output, do not assign (:=)

setDT(df1)[, .(N = sum(unlist(.SD)>0)), Name, .SDcols = L2:L4]
#   Name N
#1: Carl 5
#2:  Joe 4

Upvotes: 0

count

Reputation: 1338

Or if you want to have a function:

give_count <- function(dat, name) {
    # Count the cells from column 3 onward (L2 to L4 here) that are greater than 0
    sum(dat[dat$Name == name, 3:ncol(dat)] > 0)
}
give_count(df1, "Joe")
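
A slightly more defensive variant of such a function might look like the sketch below. The function name count_positive, the cols argument, and the choice to ignore missing values are all assumptions for illustration, not part of the original answer; selecting columns by name avoids counting any extra columns (such as an added N column) that sit to the right of L4.

```r
# Example data from the question
df1 <- data.frame(
  Name = c("Carl", "Carl", "Joe", "Joe"),
  L1 = c(1, 0, 3, 2),
  L2 = c(1, 1, 0, 2),
  L3 = c(0, 4, 3, 1),
  L4 = c(2, 1, 1, 0)
)

# Hypothetical helper: count values greater than 0 in `cols` for one name
count_positive <- function(dat, name, cols = c("L2", "L3", "L4")) {
  rows <- dat$Name == name
  # drop = FALSE keeps a data frame even if a single column is selected
  values <- as.matrix(dat[rows, cols, drop = FALSE])
  # NA cells are ignored (an assumption about the desired behaviour)
  sum(values > 0, na.rm = TRUE)
}

# One count per distinct name
all_counts <- sapply(unique(df1$Name), count_positive, dat = df1)
```

With the example data, count_positive(df1, "Carl") gives 5 and count_positive(df1, "Joe") gives 4; a name that does not occur gives 0.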

Upvotes: 1
