Lynn
Lynn

Reputation: 4398

Calculate Growth Increase, grouping by Type (Using R)

I have a dataset, z, that I wish to calculate the growth increase by the type:

location   size     type     date

ny         5        hello    10/01/2020
ny         7        ai       10/02/2020
ny         8        ai       10/03/2020
ny         6        hello    10/04/2020
ca         15       cool     10/05/2020
ca         10       name     10/06/2020
ca         5        name     10/07/2020
ca         16       cool     10/08/2020

Desired output

location  type    increase   percent_increase     start_date    end_date

ca        cool    1           6.67%               10/05/2020    10/08/2020
ca        name    -5         -50%                 10/6/2020     10/7/2020
ny        hello   1           20%                 10/01/2020    10/4/2020
ny        ai      1           14.28%              10/2/2020     10/3/2020

This is what I am doing:

library(tidyverse)
z %>%
group_by(type, location) %>% 
mutate(percent_increase = (size/lead(size) - 1) * 100)

I am not getting my desired output. Any assistance is appreciated.

Upvotes: 1

Views: 459

Answers (2)

Russ Thomas
Russ Thomas

Reputation: 988

To get the results you want, you need a different calculation in your mutate line:

I also added a filter to remove any results with NA for the percent_increase variable.

And finally added ```arrange`` to sort alphabetically by location to match the same order as your requested output.

CODE

z %>% group_by(type, location) %>% mutate( increase = (lead(size) - size), percent_increase = (increase/size) * 100, start_date = date, end_date = lead(date)) %>% filter(!is.na(percent_increase)) %>% arrange(location)

OUTPUT

# A tibble: 4 x 8
# Groups:   type, location [4]
  location  size type  date       increase percent_increase start_date end_date  
  <chr>    <int> <chr> <chr>         <int>            <dbl> <chr>      <chr>     
1 ca          15 cool  10/05/2020        1             6.67 10/05/2020 10/08/2020
2 ca          10 name  10/06/2020       -5           -50    10/06/2020 10/07/2020
3 ny           5 hello 10/01/2020        1            20    10/01/2020 10/04/2020
4 ny           7 ai    10/02/2020        1            14.3  10/02/2020 10/03/2020

INPUT

z <- structure(list(location = c("ny", "ny", "ny", "ny", "ca", "ca", 
"ca", "ca"), size = c(5L, 7L, 8L, 6L, 15L, 10L, 5L, 16L), type = c("hello", 
"ai", "ai", "hello", "cool", "name", "name", "cool"), date = c("10/01/2020", 
"10/02/2020", "10/03/2020", "10/04/2020", "10/05/2020", "10/06/2020", 
"10/07/2020", "10/08/2020")), class = "data.frame", row.names = c(NA, 
-8L))

Upvotes: 3

user14567880
user14567880

Reputation:

you're missing arrange function to organize by date

like this:

z %>%
group_by(type, location) %>% 
arrange(date) %>%
mutate(percent_increase = (size/lead(size) - 1) * 100)

Upvotes: 2

Related Questions