Reputation: 103
overseas_domestic_indicator ref_year count
<chr> <dbl> <dbl>
1 Domestic 2014 17854
2 Domestic 2015 18371
3 Domestic 2016 18975
4 Domestic 2017 19455
5 Domestic 2018 19819
6 Overseas 2014 6491
7 Overseas 2015 7393
8 Overseas 2016 8594
9 Overseas 2017 9539
10 Overseas 2018 10455
This is my data. I want something like:
ref_year Domestic/Overseas
2014 2.75
2015 ...
... ...
But I don't know how to do this using tidyverse. I tried to use mutate but I don't know how to clarify the count for Domestic and Overseas. Thanks in advance.
Upvotes: 0
Views: 50
Reputation: 887153
We can do a group by 'ref_year' and summarise
by dividing the 'count' corresponding to 'Domestic' with that of 'Overseas' and reshape to 'wide' if needed
library(dplyr)
library(tidyr)
df1 %>%
group_by(ref_year) %>%
summarise(
`Domestic/Overseas` = count[overseas_domestic_indicator == 'Domestic']/
count[overseas_domestic_indicator == 'Overseas'])
# A tibble: 5 x 2
# ref_year `Domestic/Overseas`
# <int> <dbl>
#1 2014 2.75
#2 2015 2.48
#3 2016 2.21
#4 2017 2.04
#5 2018 1.90
Or arrange
first and then do a division
df1 %>%
arrange(ref_year, overseas_domestic_indicator) %>%
group_by(ref_year) %>%
summarise( `Domestic/Overseas` = first(count)/last(count))
Or with dcast
from data.table
library(data.table)
dcast(setDT(df1), ref_year ~ overseas_domestic_indicator)[,
`Domestic/Overseas` := Domestic/Overseas][]
df1 <- structure(list(overseas_domestic_indicator = c("Domestic", "Domestic",
"Domestic", "Domestic", "Domestic", "Overseas", "Overseas", "Overseas",
"Overseas", "Overseas"), ref_year = c(2014L, 2015L, 2016L, 2017L,
2018L, 2014L, 2015L, 2016L, 2017L, 2018L), count = c(17854L,
18371L, 18975L, 19455L, 19819L, 6491L, 7393L, 8594L, 9539L, 10455L
)), class = "data.frame", row.names = c("1", "2", "3", "4", "5",
"6", "7", "8", "9", "10"))
Upvotes: 2
Reputation: 719
This should work too, multiple ways to do this
df %>%
pivot_wider(overseas_domestic_indicator,
names_from = overseas_domestic_indicator,
values_from = count) %>%
mutate(Ratio = Domestic/Overseas)
Upvotes: 1
Reputation: 388982
You can get the data in wide format first and then divide Domestic
by Overseas
library(dplyr)
df %>%
tidyr::pivot_wider(names_from = overseas_domestic_indicator,
values_from = count) %>%
mutate(ratio = Domestic/Overseas)
# ref_year Domestic Overseas ratio
# <int> <int> <int> <dbl>
#1 2014 17854 6491 2.75
#2 2015 18371 7393 2.48
#3 2016 18975 8594 2.21
#4 2017 19455 9539 2.04
#5 2018 19819 10455 1.90
Upvotes: 2