swagbag
swagbag

Reputation: 105

How to make a count variable for a certain criteria separated into two subgroups?

So I have a data set with a column of ID Numbers, column of yearmons, and another column with H, D or empty. I want to sum the H and D's for a specific id for each month. Is this possible?

Lets say I have these 3 columns, this is what I would want count to equal. Within each unique ID, I want to tally the H and D in a certain YearMon.

ID:    YearMon:    Naturaleza: Count:
1      012007      H            1
1      012007      D            2
1      022007                   NA
1      032007      H            1
2      012007      H            1
2      022007                   Na
2      022007      D            1
2      032007                   NA

Upvotes: 0

Views: 55

Answers (1)

akrun
akrun

Reputation: 887851

We can try with data.table. Convert the 'data.frame' to 'data.table' (setDT(df1)), grouped by 'ID', 'YearMon', we create the sequence as a column 'Count' after specifying the logical condition for 'i' (Naturaleza!='').

library(data.table)
setDT(df1)[Naturaleza != '', Count := 1:.N , by = .(ID, YearMon)]
df1
#   ID YearMon Naturaleza Count
#1:  1   12007          H     1
#2:  1   12007          D     2
#3:  1   22007               NA
#4:  1   32007          H     1
#5:  2   12007          H     1
#6:  2   22007               NA
#7:  2   22007          D     1
#8:  2   32007               NA

data

df1 <- structure(list(ID = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L),
YearMon = c(12007L, 
 12007L, 22007L, 32007L, 12007L, 22007L, 22007L, 32007L), 
Naturaleza = c("H", 
"D", "", "H", "H", "", "D", "")), .Names = c("ID", "YearMon", 
"Naturaleza"), row.names = c(NA, -8L), class = "data.frame")

Upvotes: 2

Related Questions