add summary `n` from one dataframe to another dataframe (tidyverse)

Question

I was wondering if there might be a way to replace the column fpc in DATA2 with corresponding fpc obtained from DATA1?

library(tidyverse)

dat <- read.csv('https://raw.githubusercontent.com/rnorouzian/d/master/su.csv')

## 10000 rows ################
 DATA1 <- dat %>%
    group_by(across(all_of(c("gender", "pre")))) %>%
    summarise(n = n(), .groups = 'drop') %>%
    mutate(fpc = n/sum(n)) %>% 
    right_join(dat)


dat2 <- read.csv('https://raw.githubusercontent.com/rnorouzian/d/master/out.csv')
  
## 200 rows #################
DATA2 <- dat2 %>%
  group_by(across(all_of(c("gender", "pre")))) %>%
  summarise(n = n(), .groups = 'drop') %>%
  mutate(fpc = n/sum(n)) %>% 
  right_join(dat2)

Ronak Shah · Accepted Answer

You can join the dataframe and use coalesce to select fpc from DATA2.

library(dplyr)

result <- DATA2 %>%
           left_join(DATA1 %>% distinct(gender, pre, fpc), 
                     by = c('gender', 'pre')) %>%
           mutate(fpc = coalesce(fpc.y, fpc.x)) %>%
           select(names(DATA2))

nrow(result)
#[1] 200

add summary `n` from one dataframe to another dataframe (tidyverse)

Answers (2)

Related Questions