Reputation: 625

Calculate percent change from a baseline year (t0) to a subsequent BUT LIMITED series of years (t1, ..., tk)

Imagine you have yearly data for some sort of expenses. You are interested in the percent difference between the first value (t0) and each subsequent value (t1, ... -> tx) BUT only for a specific group of observations, i.e. with the next group, a new series of subsequent years starts.

Example:

    value <- c(10225,10287,10225,10087,10344,10387,10387,14567,13992,15432)
    case <- c(A,A,A,B,B,B,B,B,C,C)

    year    value   case   change
    1989    10225   A      0.00
    1990    10287   A      0.61 # ((100/10225)*10287)-100
    1991    10262   A      0.36
    1995    10087   B      0.00
    1996    10344   B      2.55 # ((100/10087)*10344)-100
    1997    10387   B      2.97 
    1978    10387   B      2.97
    1979    14567   B      ...
    1980    13992   C
    1981    15432   C

How can I calculate the percent change in R?

The answers to my earlier post and similar posts (e.g., this post on calculating relative difference) were very helpful. Thanks again!

However, I had to realize that my case is more complex and edited my question accordingly. The problem is that I do not have ONE series of subsequent years but A NUMBER of limited series of subsequent years, one per group of cases.

Any ideas are highly appreciated!

Many thanks.

Upvotes: 3

Answers (3)

Jilber Urbina

Reputation: 61164

What about this?

((value[-1]/value[1])-1)*100
[1]  0.6063570  0.0000000 -1.3496333  1.1638142  1.5843521  0.7334963

Another alternative

((value - value[1]) / value[1]) * 100
[1]  0.0000000  0.6063570  0.0000000 -1.3496333  1.1638142  1.5843521  0.7334963

For your updated question, here's two R base solutions:

transform(df, Change = unlist(sapply(split(value, case), function(x) ((x - x[1]) / x[1]) * 100)))
   value case    Change
A1 10225    A  0.000000
A2 10287    A  0.606357
A3 10225    A  0.000000
B1 10087    B  0.000000
B2 10344    B  2.547834
B3 10387    B  2.974125
B4 10387    B  2.974125
B5 14567    B 44.413602
C1 13992    C  0.000000
C2 15432    C 10.291595

 transform(df, Change = unlist(aggregate(value ~ case, function(x) ((x - x[1]) / x[1]) * 100, data=df)$value))
   value case    Change
01 10225    A  0.000000
02 10287    A  0.606357
03 10225    A  0.000000
11 10087    B  0.000000
12 10344    B  2.547834
13 10387    B  2.974125
14 10387    B  2.974125
15 14567    B 44.413602
21 13992    C  0.000000
22 15432    C 10.291595

Upvotes: 5

Paul Hiemstra

Reputation: 60944

To answer your expanded question, use transform combined with ddply from the plyr package:

ddply(df, .(case), transform, change = ((100 / value[1]) * value) - 100)

In regard to your comment on the NA and Inf values, this is expected behavior as you are dividing by zero, making the change meaningless. You could delete those entries.

Upvotes: 3

seancarmody

Reputation: 6290

If your data frame is called, say, df, try something like this:

transform(df, change = 100*(value/value[year==1989] - 1))

noting that this will give a value of 0 for 1989 not NA:

#   year value     change
# 1 1989 10225  0.0000000
# 2 1990 10287  0.6063570
# 3 1991 10225  0.0000000
# 4 1992 10087 -1.3496333
# 5 1993 10344  1.1638142
# 6 1994 10387  1.5843521
# 7 1995 10300  0.7334963

If you know you want the first record to be the base you can simply use

transform(df, change = 100*(value/value[1] - 1))

Upvotes: 2

Calculate percent change from a baseline year (t0) to a subsequent BUT LIMITED series of years (t1, ..., tk)

Answers (3)

Related Questions