syre
syre

Reputation: 982

Combine aggregates at different levels of detail in R

I have a table of employment by city and industry

df <- read.table(text="city industry emp
Washington Auto 2
Washington Aero 2
Boston Auto 4
Boston Aero 2", header = TRUE)

I want to calculate a Relative Diversity Index by city, i.e. the sum for each city, over all industries, of the absolute value of the difference between each industry’s share in local employment and its share in national employment. The math looks like this: RDIc = 1/∑i|Sci-Si| (small letters are subscripts: c for city and i for industry; RDI is the index; S means share).

Using the above data, I should get:

city       rdi
Washington   5
Boston     7.5

Because:

Of course, this is mock data and I have 100s of cities and industries. I haven't been able to do this in R, even in multiple steps, short of splitting the df by city and then reassembling it, which seems very clunky.

Upvotes: 1

Views: 63

Answers (2)

shiny
shiny

Reputation: 3502

This answer is based on the nice answer by @Gregor. I have just tried to reduce the steps so you don't need to create another data.frame and then join it

df1 <- df %>% 
  mutate(ind_total = sum(emp)) %>% 
  group_by(industry) %>% 
  mutate(si = sum(emp) / first(ind_total)) %>% 
  group_by(city) %>% 
  mutate(sci = emp / sum(emp)) %>% 
  summarise(rdi = 1 / sum(abs(sci - si)))

> df1
# A tibble: 2 × 2
        city   rdi
      <fctr> <dbl>
1     Boston   7.5
2 Washington   5.0

Upvotes: 3

Gregor Thomas
Gregor Thomas

Reputation: 145825

Lots of little steps, but this works

library(dplyr)
natl = df %>%
    mutate(ind_total = sum(emp)) %>%
    group_by(industry) %>%
    summarize(si = sum(emp) / first(ind_total)) %>%
    select(industry, si)

result = df %>%
    group_by(city) %>%
    mutate(sci = emp / sum(emp)) %>%
    inner_join(natl) %>%
    group_by(city) %>%
    summarize(rdi = 1 / sum(abs(sci - si)))

result
# # A tibble: 2 × 2
#         city   rdi
#       <fctr> <dbl>
# 1     Boston   7.5
# 2 Washington   5.0

Upvotes: 2

Related Questions