sharmi
sharmi

Reputation: 5

Creating variable that depends on it lagged value in R

I have a panel dataset with the following columns/variables: Week(t), Custid(i), Activity(i,t), Initial(i) and a host of other variables. I need to create a new variable Experience(i,t) = alpha * Experience(i,t-1) + Activity(i, t-1). The initial value for Experience(i,0)= Initial(i). I am new to R and just about making my transition from SAS to R. How can I create this new variable Experience at the customer level and by week. That is, the value of this variable for the ith customer at week t will depend on the lagged value of the variable in the past week (t-1) plus an Activity made by the customer I in the past week (t-1). Please help, I have a super time crunch to work this problem out. All and any help is sincerely appreciated!

Upvotes: 0

Views: 230

Answers (1)

MrFlick
MrFlick

Reputation: 206197

I'll assume this is your input data

#sample data
set.seed(15)
dd<-data.frame(
     week=rep(1:5, 3), 
     cust=rep(1:3, each=5), 
     activity=runif(15)
)
init<-c(1,2,3)
alpha<-.5

and that looks like

   week cust  activity
1     1    1 0.6021140
2     2    1 0.1950439
3     3    1 0.9664587
4     4    1 0.6509055
5     5    1 0.3670719
6     1    2 0.9888592
7     2    2 0.8151934
8     3    2 0.2539684
9     4    2 0.6872308
10    5    2 0.8314290
11    1    3 0.1046694
12    2    3 0.6461509
13    3    3 0.5090904
14    4    3 0.7066286
15    5    3 0.8623137

Then we calculate Experience with

Experience <- Map(function(i,d) 
    Reduce(function(a,b) 
        {alpha*a + b}, d$activity, i, accumulate=TRUE),
    init, split(dd, dd$cust)
)

we use the outer Map to iterate over the initial values and the subsets of the data for each customer created using split. Then, the inner Reduce implements the lagged algorithm as you've described it.

Then to join it back to the table, we need to remove the initial values form the list and re-stack the values back in order. We can do that with

ExpCol <- unsplit(lapply(Experience, tail, -1), dd$cust)
cbind(dd, ExpCol)

which gives us

   week cust  activity    ExpCol
1     1    1 0.6021140 1.1021140
2     2    1 0.1950439 0.7461009
3     3    1 0.9664587 1.3395092
4     4    1 0.6509055 1.3206601
5     5    1 0.3670719 1.0274020
6     1    2 0.9888592 1.9888592
7     2    2 0.8151934 1.8096230
8     3    2 0.2539684 1.1587799
9     4    2 0.6872308 1.2666208
10    5    2 0.8314290 1.4647394
11    1    3 0.1046694 1.6046694
12    2    3 0.6461509 1.4484856
13    3    3 0.5090904 1.2333332
14    4    3 0.7066286 1.3232952
15    5    3 0.8623137 1.5239612

Upvotes: 1

Related Questions