Eloise
Eloise

Reputation: 3

How to find the annual evolution rate for each firm in my data table?

So I have a data table of 5000 firms, each firm is assigned a numerical value ("id") which is 1 for the first firm, 2 for the second ...

Here is my table with only the profit variable : |id | year | profit |:----| :----| :----| |1 |2001 |-0.4 |1 |2002 |-0.89 |2 |2001 |1.89 |2 |2002 |2.79

Each firm is expressed twice, one line specifies the data in 2001 and the second in 2002 (the "id" value being the same on both lines because it is the same firm one year apart).

How to calculate the annual rate of change of each firm ("id") between 2001 and 2002 ?

I'm really new to R and I don't see where to start? Separate the 2001 and 2002 data? I did this : years <- sort(unique(group$year))years

And I also found this on the internet but with no success :

library(dplyr)
res <-
  group %>%
  arrange(id,year) %>%
  group_by(id) %>%
  mutate(evol_rate = ("group$year$2002" / lag("group$year$2001") - 1) * 100)  %>%
  ungroup()



 

Thank you very much

Upvotes: 0

Views: 139

Answers (2)

pgcudahy
pgcudahy

Reputation: 1601

If you want to do it without reshaping your data into a wide format you can use

library(tidyverse)

id <- sort(rep(seq(1,250, 1), 2))
year <- rep(seq(2001, 2002, 1), 500)
value <- sample(500:2000, 500)

df <- data.frame(id, year, value)

df %>% head(n = 10)
#>    id year value
#> 1   1 2001  1173
#> 2   1 2002  1648
#> 3   2 2001  1560
#> 4   2 2002  1091
#> 5   3 2001  1736
#> 6   3 2002   667
#> 7   4 2001  1840
#> 8   4 2002  1202
#> 9   5 2001  1597
#> 10  5 2002  1797

new_df <- df %>%
    group_by(id) %>%
    mutate(ROC = ((value / lag(value) - 1) * 100))

new_df %>% head(n = 10)
#> # A tibble: 10 × 4
#> # Groups:   id [5]
#>       id  year value   ROC
#>    <dbl> <dbl> <int> <dbl>
#>  1     1  2001  1173  NA  
#>  2     1  2002  1648  40.5
#>  3     2  2001  1560  NA  
#>  4     2  2002  1091 -30.1
#>  5     3  2001  1736  NA  
#>  6     3  2002   667 -61.6
#>  7     4  2001  1840  NA  
#>  8     4  2002  1202 -34.7
#>  9     5  2001  1597  NA  
#> 10     5  2002  1797  12.5

This groups the data by id and then uses lag to compare the current year to the year prior

Upvotes: 0

mgronas_werring
mgronas_werring

Reputation: 53

From what you've written, I take it that you want to calculate the formula for ROC for the profit values of 2001 and 2002:

ROC=(current_value​/previous_value − 1) ∗ 100

To accomplish this, I suggest tidyr::pivot_wider() which reshapes your dataframe from long to wide format (see: https://r4ds.had.co.nz/tidy-data.html#pivoting).

Code:

require(tidyr)
require(dplyr)

id <- sort(rep(seq(1,250, 1), 2))
year <- rep(seq(2001, 2002, 1), 500)
value <- sample(500:2000, 500)

df <- data.frame(id, year, value)

head(df, 10)
#>    id year value
#> 1   1 2001   856
#> 2   1 2002  1850
#> 3   2 2001  1687
#> 4   2 2002  1902
#> 5   3 2001  1728
#> 6   3 2002  1773
#> 7   4 2001   691
#> 8   4 2002  1691
#> 9   5 2001  1368
#> 10  5 2002   893

df_wide <- df %>%
  pivot_wider(names_from = year,
              names_prefix = "profit_",
              values_from = value,
              values_fn = mean)

res <- df_wide %>%
  mutate(evol_rate = (profit_2002/profit_2001-1)*100) %>%
  round(2)

head(res, 10)
#> # A tibble: 10 x 4
#>       id profit_2001 profit_2002 evol_rate
#>    <dbl>       <dbl>       <dbl>     <dbl>
#>  1     1         856        1850    116.  
#>  2     2        1687        1902     12.7 
#>  3     3        1728        1773      2.6 
#>  4     4         691        1691    145.  
#>  5     5        1368         893    -34.7 
#>  6     6         883         516    -41.6 
#>  7     7        1280        1649     28.8 
#>  8     8        1579        1383    -12.4 
#>  9     9        1907        1626    -14.7 
#> 10    10        1227        1134     -7.58

Upvotes: 1

Related Questions