Reputation: 117
I have a time series of Sales by Account ID. To calculate average growth, I need to extract the first month with non-zero sales for each ID. Since accounts could have been established at different times, I need to dynamically identify the first month in which Sales > 0 for each account.
The row index would be sufficient for me to pass to a function calculating growth. So I expect the following results by Account ID:
54 - [1]
87 - [4]
95 - [2]
I tried `apply(df$Sales,2,match,x>0)` but this doesn't work.
Any pointers? Alternatively, is there an easier way to compute CAGR with this dataset?
Thanks in advance!
CalendarMonth ID Sales
8/1/2008 54 6692.60274
9/1/2008 54 6476.712329
10/1/2008 54 6692.60274
11/1/2008 54 6476.712329
12/1/2008 54 11098.60822
7/1/2008 87 0
8/1/2008 87 0
9/1/2008 87 0
10/1/2008 87 18617.94155
11/1/2008 87 18017.36279
12/1/2008 87 18617.94155
1/1/2009 87 18617.94155
2/1/2009 87 16816.20527
7/1/2008 95 0
8/1/2008 95 8015.956284
9/1/2008 95 0
10/1/2008 95 8015.956284
11/1/2008 95 6309.447514
12/1/2008 95 6519.762431
1/1/2009 95 6519.762431
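For anyone who wants to run the answers against this sample, here is one way to load it. This is only a sketch: the name `df` matches what the answers use, and keeping `CalendarMonth` as plain text (rather than parsing it as a date) is my assumption.

```r
# Rebuild the sample table from the question.
# read.table() splits on whitespace and takes the first non-blank line as the header.
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
CalendarMonth ID Sales
8/1/2008 54 6692.60274
9/1/2008 54 6476.712329
10/1/2008 54 6692.60274
11/1/2008 54 6476.712329
12/1/2008 54 11098.60822
7/1/2008 87 0
8/1/2008 87 0
9/1/2008 87 0
10/1/2008 87 18617.94155
11/1/2008 87 18017.36279
12/1/2008 87 18617.94155
1/1/2009 87 18617.94155
2/1/2009 87 16816.20527
7/1/2008 95 0
8/1/2008 95 8015.956284
9/1/2008 95 0
10/1/2008 95 8015.956284
11/1/2008 95 6309.447514
12/1/2008 95 6519.762431
1/1/2009 95 6519.762431
")

# Quick look at the column types and row count.
str(df)
```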
Upvotes: 6
Views: 7427
Reputation: 2481
Building on digEmAll's answer, here is a solution using functional programming (maybe a bit cleaner):
> res3 <- tapply(
1:nrow(df)
, df$ID
, function(Idx) Idx[Position(function(x) df[x, "Sales"] > 0, Idx)]
)
> identical(res3, res2)
[1] TRUE
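In case `Position` is unfamiliar: it returns the index of the first element for which the predicate is true, and its sibling `Find` returns the element itself. A tiny illustration on a made-up vector (the values are not from the question's data):

```r
# Made-up sales values, for illustration only.
sales <- c(0, 0, 7, 5)

# Index of the first element satisfying the predicate.
first_idx <- Position(function(x) x > 0, sales)

# The first element itself, rather than its index.
first_val <- Find(function(x) x > 0, sales)

# When nothing matches, Position() returns NA.
none <- Position(function(x) x > 0, c(0, 0))

print(first_idx)
print(first_val)
print(none)
```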
Upvotes: 1
Reputation: 2651
Would this help:
tapply(df$Sales, df$ID, function(a)head(which(a>0),1))
where `df` is your data frame above?
If you want the entire row & not just the index, this might help:
lapply(unique(df$ID),function(a) head(subset(df,ID==a & Sales>0),1))
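One thing that may be worth checking is an account that never records a sale. As far as I can tell, `head(which(a > 0), 1)` then returns an empty vector and `tapply` gives back a list, whereas `which(a > 0)[1]` gives `NA` and keeps a plain vector. A small sketch on a hypothetical `toy` data frame (not the question's data):

```r
# Hypothetical toy data: account "B" never has positive sales.
toy <- data.frame(ID = c("A", "A", "B", "B"), Sales = c(0, 5, 0, 0))

# which(...)[1] gives NA when nothing matches, so every group yields one value
# and tapply() simplifies the result to a vector.
first_na <- tapply(toy$Sales, toy$ID, function(a) which(a > 0)[1])

# head(which(...), 1) gives integer(0) when nothing matches, so the groups have
# unequal lengths and tapply() returns a list instead.
first_empty <- tapply(toy$Sales, toy$ID, function(a) head(which(a > 0), 1))

print(first_na)
print(first_empty)
```

Which behaviour is preferable depends on how the growth function downstream handles accounts with no sales.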
Upvotes: 9
Reputation: 57220
Here's a possible solution:
res1 <- tapply(df$Sales,INDEX=df$ID,FUN=function(x) which(x > 0)[1])
> res1
54 87 95
1 4 2
Where `res1` is a numeric vector with names:
> names(res1)
[1] "54" "87" "95"
If you want the row indexes in the original `data.frame` rather than within each subset, you can do:
res2 <- tapply(1:nrow(df),
INDEX=df$ID,FUN=function(idxs) idxs[df[idxs,'Sales'] > 0][1])
> res2
54 87 95
1 9 15
Then you can simply use the indexes in `res2` to subset the `data.frame`:
df2 <- df[res2,]
> df2
CalendarMonth ID Sales
8/1/2008 54 6692.603
10/1/2008 87 18617.942
8/1/2008 95 8015.956
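As for the CAGR part of the question, the same first-non-zero idea can be applied per account directly. What follows is a rough sketch, not a definitive method: the helper `cagr_from_first_sale` is hypothetical, and it assumes the rows are monthly, in chronological order within each ID, and that growth is measured from the first non-zero month to the last observed month and annualised over 12 periods.

```r
# Sample data from the question, rebuilt compactly (dates omitted;
# rows are assumed to be monthly and in chronological order per ID).
df <- data.frame(
  ID = rep(c(54, 87, 95), times = c(5, 8, 7)),
  Sales = c(6692.60274, 6476.712329, 6692.60274, 6476.712329, 11098.60822,
            0, 0, 0, 18617.94155, 18017.36279, 18617.94155, 18617.94155,
            16816.20527,
            0, 8015.956284, 0, 8015.956284, 6309.447514, 6519.762431,
            6519.762431)
)

# Hypothetical helper: annualised growth from the first non-zero month
# to the last month of the series.
cagr_from_first_sale <- function(sales, periods_per_year = 12) {
  first <- which(sales > 0)[1]
  # No positive sales, or only the final month is positive: growth undefined.
  if (is.na(first) || first == length(sales)) return(NA_real_)
  n_periods <- length(sales) - first
  (sales[length(sales)] / sales[first])^(periods_per_year / n_periods) - 1
}

# One value per account ID.
cagr <- tapply(df$Sales, df$ID, cagr_from_first_sale)
print(cagr)
```

If the last month's sales can be zero, or months can be missing, the start and end points would need to be chosen differently.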
Upvotes: 3