Using R to combine rows in a data frame based on a common variable

Question

I am working with a few survey responses and the data returned has been formatted awkwardly. Here is an example of what the data looks like:

df <- data_frame(Person = c("Person1", "Person1","Person2", "Person2","Person3", "Person3"), Q1 = c(NA, 1, NA, 2, NA, 1), Q2 = c(NA, 3, NA, 2, NA, 4),
             Q3 = c(2, NA, 4, NA, 1, NA), Q4 = c(5, NA, 5, NA, 5, NA))

This is what I am starting with:

Person     Q1    Q2    Q3    Q4
        
1 Person1    NA    NA     2     5
2 Person1     1     3    NA    NA
3 Person2    NA    NA     4     5
4 Person2     2     2    NA    NA
5 Person3    NA    NA     1     5
6 Person3     1     4    NA    NA

This is what I would like:

Person     Q1    Q2    Q3    Q4
        
1 Person1     1     3     2     5
2 Person2     2     2     4     5
3 Person3     1     4     1     5

I would like to be able to accomplish this using dplyr but so far I have not had any luck.

IceCreamToucan · Accepted Answer

You can get the first non-NA for each column within each group with coalesce. No real reason to prefer that over na.omit though unless you have >1 non-NA value.

library(tidyverse)

df %>% 
  group_by(Person) %>% 
  summarise_all(reduce, coalesce)

# # A tibble: 3 x 5
#   Person     Q1    Q2    Q3    Q4
#         
# 1 Person1     1     3     2     5
# 2 Person2     2     2     4     5
# 3 Person3     1     4     1     5

Using R to combine rows in a data frame based on a common variable

Answers (2)

Related Questions