Reputation: 13
new to R so sorry if this is a bit broad but I'm not really even sure where to start with an approach to this problem.
I have two dataframes, df1 containing demographic data from certain Census tracts.
AfricanAmerican AsianAmerican Hispanic White
Tract1 0.25 0.25 0.25 0.25
Tract2 0.50 0.10 0.20 0.10
Tract3 0.05 0.10 0.35 0.50
And df2 contains observations polygons with the percentage of each census tract that makes up its area.
Poly1 Poly2 Poly3
Tract1 0.33 0.25 0.00
Tract2 0.33 0.25 0.10
Tract3 0.34 0.50 0.90
What I want to do is get the weighted averages of the demographic data in each observation polygon
AfricanAmerican AsianAmerican Hispanic White
Poly1 0.26 0.15 0.27 0.29
Poly2 0.21 0.14 0.29 0.34
Poly3 0.10 0.10 0.34 0.46
So far I'm thinking I could do something like
sum(df1$AfricanAmerican * df2$Poly1)
Then use a for loop to iterate over all demographic variables for one polygon, then nest that in another for loop to iterate over all polygons, but given that I have hundreds of Census tracts and polygons in my working dataset, I was wondering if there was a better approach?
Upvotes: 0
Views: 55
Reputation: 73802
Use colSums
of the products in mapply
.
t(mapply(function(...) colSums(`*`(...)), list(df1), df2))
# AfricanAmerican AsianAmerican Hispanic White
# [1,] 0.2645 0.1495 0.2675 0.2855
# [2,] 0.2125 0.1375 0.2875 0.3375
# [3,] 0.0950 0.1000 0.3350 0.4600
If you want to round to two digits, just wrap round(..., 2)
around it.
Data:
df1 <- read.table(header=T, text='
AfricanAmerican AsianAmerican Hispanic White
Tract1 0.25 0.25 0.25 0.25
Tract2 0.50 0.10 0.20 0.10
Tract3 0.05 0.10 0.35 0.50
')
df2 <- read.table(header=T, text='
Poly1 Poly2 Poly3
Tract1 0.33 0.25 0.00
Tract2 0.33 0.25 0.10
Tract3 0.34 0.50 0.90
')
Upvotes: 0
Reputation: 8836
library(tidyverse)
df1 <-
tibble(
Trat = paste0("Trat",1:3),
AfricanAmerican = c(.25,.5,.05),
AsianAmerican = c(.25,.1,.1),
Hispanic = c(.25,.2,.35)
)
df2 <-
tibble(
Trat = paste0("Trat",1:3),
Poly1 = c(.33,.33,.34),
Poly2 = c(.25,.25,.5),
Poly3 = c(0,.1,.9)
) %>%
#Pivot df2 making a single column for all Poly values
pivot_longer(cols = -Trat,names_to = "Poly")
df1 %>%
#Join df1 and df2 by Trat
left_join(df2) %>%
#Grouping by Poly
group_by(Poly) %>%
#Sum values across variables AfricanAmerican to Hispanic
summarise(across(AfricanAmerican:Hispanic,function(x)sum(x*value)))
Joining, by = "Trat"
# A tibble: 3 x 4
Poly AfricanAmerican AsianAmerican Hispanic
<chr> <dbl> <dbl> <dbl>
1 Poly1 0.264 0.150 0.268
2 Poly2 0.212 0.138 0.288
3 Poly3 0.095 0.1 0.335
Upvotes: 0