Reputation: 166
I've got a dataset called data1 with headers Year and Count.
My sample data looks like this:
  Year Count
1 2005  3000
2 2006  4000
3 2007  5000
4 2008  6000
I add another column to the data which works out the yearly increase. This is my code:
data1growth <- data1 %>%
  mutate(Growth = Count - lag(Count))
I want to be able to add another column called period so that I can get the following output:
  Year Count Growth    Period
1 2005  3000     NA        NA
2 2006  4000   1000 2005-2006
3 2007  5000   1000 2006-2007
4 2008  6000   1000 2007-2008
What code should I add to the mutate function to get the desired output, or am I off the mark completely? Any help is appreciated.
Thanks everyone.
Upvotes: 0
Views: 248
Reputation: 101393
Here is a base R option:
# data1 is the data frame from the question
transform(data1,
  Growth = c(NA, diff(Count)),                             # year-over-year change
  Period = c(NA, paste0(Year[-nrow(data1)], "-", Year[-1]))  # "previous-current" label
)
which gives
  Year Count Growth    Period
1 2005  3000     NA      <NA>
2 2006  4000   1000 2005-2006
3 2007  5000   1000 2006-2007
4 2008  6000   1000 2007-2008
Upvotes: 0
Reputation: 33488
library(dplyr)
data1 %>%
  mutate(
    Growth = Count - lag(Count),
    period = if_else(
      row_number() > 1,
      paste0(lag(Year), "-", Year),
      NA_character_
    )
  )
#   Year Count Growth    period
# 1 2005  3000     NA      <NA>
# 2 2006  4000   1000 2005-2006
# 3 2007  5000   1000 2006-2007
# 4 2008  6000   1000 2007-2008
Reproducible data
data1 <- data.frame(
  Year = seq(2005L, 2008L, 1L),
  Count = seq(3000L, 6000L, 1000L)
)
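A note on why the if_else() guard above is needed, as a minimal base R sketch (the names prev_year and naive are mine, purely for illustration): paste0() converts a missing value to the text "NA", so pasting the lagged year directly would label the first row "NA-2005" instead of leaving it missing.

```r
# Reproducible data from the question
data1 <- data.frame(Year = 2005:2008, Count = c(3000, 4000, 5000, 6000))

# Previous year for each row; the first row has none
# (base R stand-in for dplyr::lag(), so no packages are needed)
prev_year <- c(NA, head(data1$Year, -1))

# paste0() turns NA into the string "NA", so the first label is not missing
naive <- paste0(prev_year, "-", data1$Year)

# Guard explicitly to keep a real missing value in the first row
data1$Growth <- c(NA, diff(data1$Count))
data1$Period <- ifelse(is.na(prev_year), NA_character_, naive)
```

Here naive[1] is the string "NA-2005", while data1$Period[1] is a genuine NA, which matches the output requested in the question.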
Upvotes: 1
Reputation: 2541
If you want 'Period' to be a plain string, you can use another mutate:
library(tidyverse)
data1 <- tibble(Year = 2005:2008, Count = c(3000, 4000, 5000, 6000))
data1growth <- data1 %>%
  mutate(Growth = Count - lag(Count))
# Period as string (previous year first, to match the requested "2005-2006" format)
data1growth %>%
  mutate(Period = paste0(Year - 1, "-", Year))
#> # A tibble: 4 x 4
#>    Year Count Growth Period
#>   <int> <dbl>  <dbl> <chr>
#> 1  2005  3000     NA 2004-2005
#> 2  2006  4000   1000 2005-2006
#> 3  2007  5000   1000 2006-2007
#> 4  2008  6000   1000 2007-2008
# Period as string (don't include NA Growth)
data1growth %>%
  mutate(Period = ifelse(is.na(Growth), NA, paste0(Year - 1, "-", Year)))
#> # A tibble: 4 x 4
#>    Year Count Growth Period
#>   <int> <dbl>  <dbl> <chr>
#> 1  2005  3000     NA <NA>
#> 2  2006  4000   1000 2005-2006
#> 3  2007  5000   1000 2006-2007
#> 4  2008  6000   1000 2007-2008
Upvotes: 0