Reputation: 85
I would like to write a function that, for a given name, counts how many values in columns L2, L3, and L4 are greater than 0.
Name L1 L2 L3 L4
Carl 1 1 0 2
Carl 0 1 4 1
Joe 3 0 3 1
Joe 2 2 1 0
For example, someFunction(Carl) = 5 and someFunction(Joe) = 4
I do not want to sum the values; for example, someFunction(Joe) = 7 is incorrect. I hope this makes sense. I am pretty stuck on this. Thanks!
Upvotes: 1
Views: 1038
Reputation: 2759
I would encourage using the tidyverse style of coding. If you use the dplyr and reshape2 packages, the code is elegant and easy to read:
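library(dplyr)
library(reshape2)
df1 %>%
  select(-L1) %>%                           # drop the column that is not counted
  melt(id.vars = "Name", na.rm = TRUE) %>%  # long format: one row per value
  group_by(Name) %>%
  transmute(flag = value > 0) %>%           # TRUE where the value is positive
  summarize(sum(flag))                      # count the TRUEs per Name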
# A tibble: 2 × 2
Name `sum(flag)`
<fctr> <int>
1 Carl 5
2 Joe 4
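The reshape-then-count steps above can also be sketched in base R, without extra packages. This is only an illustration under assumptions: df1 is rebuilt here from the table in the question, and the names cols, long, and counts are made up for the example.

```r
# Sample data rebuilt from the question (assumed to match the asker's df1)
df1 <- data.frame(
  Name = c("Carl", "Carl", "Joe", "Joe"),
  L1 = c(1, 0, 3, 2),
  L2 = c(1, 1, 0, 2),
  L3 = c(0, 4, 3, 1),
  L4 = c(2, 1, 1, 0),
  stringsAsFactors = FALSE
)
cols <- c("L2", "L3", "L4")

# Long format: one row per (Name, value) pair, similar to what melt() produces.
# unlist() walks the columns one after another, so Name is repeated once per column.
long <- data.frame(
  Name  = rep(df1$Name, times = length(cols)),
  value = unlist(df1[cols], use.names = FALSE)
)

# Flag the positive values, then count the flags within each Name
long$flag <- long$value > 0
counts <- tapply(long$flag, long$Name, sum)
print(counts)
```

For the sample data this gives 5 for Carl and 4 for Joe, the same as the pipeline above.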
Upvotes: 0
Reputation: 887118
We can try with data.table. Convert the 'data.frame' to a 'data.table' (setDT(df1)), group by 'Name', specify the columns of interest in .SDcols, unlist the Subset of Data.table (.SD), check whether each value is greater than 0, and get the sum of the logical vector. This is assigned (:=) to create the 'N' column
library(data.table)
setDT(df1)[, N := sum(unlist(.SD)>0), Name, .SDcols = L2:L4]
df1
# Name L1 L2 L3 L4 N
#1: Carl 1 1 0 2 5
#2: Carl 0 1 4 1 5
#3: Joe 3 0 3 1 4
#4: Joe 2 2 1 0 4
Or another option is
setDT(df1)[, N := sum(unlist(lapply(.SD, `>`, 0))), Name, .SDcols = L2:L4]
Or we can use a rowsum/rowSums combination in base R
rowSums(rowsum(+(df1[3:5]>0), df1$Name))
# Carl Joe
# 5 4
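To see what the rowsum/rowSums one-liner does, it may help to split it into steps. This is a sketch that assumes df1 is a plain data.frame rebuilt from the question's table; the intermediate names (is_pos, flags, per_group, counts) are made up for illustration.

```r
# Sample data rebuilt from the question (assumed to match the asker's df1)
df1 <- data.frame(
  Name = c("Carl", "Carl", "Joe", "Joe"),
  L1 = c(1, 0, 3, 2),
  L2 = c(1, 1, 0, 2),
  L3 = c(0, 4, 3, 1),
  L4 = c(2, 1, 1, 0),
  stringsAsFactors = FALSE
)

# Step 1: logical matrix, TRUE where a value in L2..L4 is greater than 0
is_pos <- df1[c("L2", "L3", "L4")] > 0

# Step 2: unary plus converts TRUE/FALSE to 1/0
flags <- +is_pos

# Step 3: rowsum() adds up the rows of `flags` within each Name,
# giving one row per Name and one column per L2..L4
per_group <- rowsum(flags, df1$Name)
print(per_group)

# Step 4: rowSums() adds across the columns, leaving one count per Name
counts <- rowSums(per_group)
print(counts)
```

Note that indexing columns as df1[3:5] relies on df1 being a data.frame; once setDT(df1) has been run, df1[3:5] selects rows instead, so selecting by column name is the safer choice here.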
If we need only to do this for a particular 'Name'
setDT(df1)[Name == "Carl"][, sum(unlist(.SD) > 0), .SDcols = L2:L4]
If we need a summarised output, do not assign (:=)
setDT(df1)[, .(N = sum(unlist(.SD)>0)), Name, .SDcols = L2:L4]
# Name N
#1: Carl 5
#2: Joe 4
Upvotes: 0
Reputation: 1338
Or if you want to have a function:
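give_count <- function(dat, name) {
  # count the values greater than 0 in columns 3 onward (L2, L3, L4)
  # for the rows whose Name matches `name`
  sum(dat[dat$Name == name, 3:ncol(dat)] > 0)
}
give_count(data, "Joe")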
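As a usage sketch, here is a variant that selects the columns by name instead of by position and is applied to every name at once. The function name count_positive, its cols argument, and the rebuilt df1 are assumptions for illustration, not part of the original answer. It tests for > 0 to match the question's wording; for non-negative data this gives the same counts as != 0.

```r
# Sample data rebuilt from the question (assumed to match the asker's data frame)
df1 <- data.frame(
  Name = c("Carl", "Carl", "Joe", "Joe"),
  L1 = c(1, 0, 3, 2),
  L2 = c(1, 1, 0, 2),
  L3 = c(0, 4, 3, 1),
  L4 = c(2, 1, 1, 0),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count values greater than 0 in `cols` for one name
count_positive <- function(dat, name, cols = c("L2", "L3", "L4")) {
  sum(dat[dat$Name == name, cols] > 0)
}

carl <- count_positive(df1, "Carl")
joe  <- count_positive(df1, "Joe")

# Apply the helper to each distinct name; the result is a named vector
all_counts <- sapply(unique(df1$Name), function(nm) count_positive(df1, nm))
print(all_counts)
```

Selecting by name avoids miscounting if extra columns are later added to the data frame, such as the N column created by the data.table answer.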
Upvotes: 1