Reputation: 67
I need to subset rows of a data frame based on a "Flag" variable. In the example below, whenever a row with Flag == 1 is followed by a run of rows with Flag == 0 (three of them, or any other number), I want to summarize M1 and M2 over that run of Flag == 0 rows. The number of such events will vary within each G1 + G2 combination, and I need one summary per event.
Can such subsetting and summarizing be done with functions such as aggregate or its variants, or must it be coded with loops that explicitly index each element? Any hints would be much appreciated.
G1 G2 G3 Flag M1 M2
10 1 0 0 0 5336.682
10 1 0 1 1 1871.782
10 1 0 0 0 3330.898
10 1 0 0 0 763.134
10 1 0 0 1 1183.485
10 1 0 0 1 385.664
10 1 0 0 1 372.036
10 1 0 1 1 329.601
10 1 1 1 0 281.965
10 1 1 0 0 377.866
10 1 1 0 0 328.342
10 1 1 0 0 512.528
10 1 1 1 0 777.216
10 1 1 0 0 409.559
10 1 1 1 0 417.606
10 1 1 1 0 673.728
10 1 1 0 0 1090.082
10 1 1 0 0 345.481
10 1 1 0 0 329.294
10 2 ... ... ... ...
11 1 ... ... ... ...
... ... ... ... ... ...
11 2 ... ... ... ...
Upvotes: 0
Views: 99
Reputation: 2000
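You can use data.table. First, put your data into a data frame called 'df'. Then run:
library(data.table)
dt <- as.data.table(df)
# Start a new group at every Flag == 1 row, restarting within each G1/G2 combination
dt[, group := cumsum(Flag), by = c("G1", "G2")]
# Sum M1 and M2 over the Flag == 0 rows that follow a flagged row (group 0 precedes the first flag)
dt[Flag == 0 & group > 0, list(M1 = sum(M1), M2 = sum(M2)), by = c("G1", "G2", "group")]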
You didn't specify which summary you need, so this simply sums M1 and M2.
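Since the question asks about aggregate, here is a rough base R sketch of the same cumsum idea, in case you'd rather avoid an extra package. It uses the first twelve rows of the sample data from the question; the names event, runs and result are just illustrative, and sum is only a placeholder for whatever summary you actually want.

```r
# First twelve rows of the question's sample data (G1 = 10, G2 = 1)
df <- data.frame(
  G1   = 10,
  G2   = 1,
  G3   = c(0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1),
  Flag = c(0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0),
  M1   = c(0, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0),
  M2   = c(5336.682, 1871.782, 3330.898, 763.134, 1183.485, 385.664,
           372.036, 329.601, 281.965, 377.866, 328.342, 512.528)
)

# Event counter that restarts within each G1/G2 combination:
# it increases by one at every Flag == 1 row.
df$event <- ave(df$Flag, df$G1, df$G2, FUN = cumsum)

# Keep only the Flag == 0 rows that follow a flagged row
# (event == 0 means no flag has been seen yet in that G1/G2 block).
runs <- df[df$Flag == 0 & df$event > 0, ]

# One summary row per event; swap sum for mean, median, etc. as needed.
result <- aggregate(cbind(M1, M2) ~ G1 + G2 + event, data = runs, FUN = sum)
print(result)
```

For these twelve rows you get two events (1 and 3). Event 2 does not appear because that flagged row is immediately followed by another flagged row, so it has no Flag == 0 rows to summarize.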
Upvotes: 2