Retain rows with blank values in one particular column and remove rows with blank values in all other columns

Question

I have a data frame consisting of participants' scores in a wide format, with each column representing one questionnaire item and each row representing the responses of one participant.

My data would look something like the following:

id <- c("001", "002", "003", "004", "005")
q1 <- c(1, 2, "", 3, 5)
q2 <- c("", 2, 3, 1, "")
q3 <- c(1, 3, 2, 5, 2)
q4 <- c(1, 2, 3, "", 4)
q5 <- c(1, 2, 3, 4, 3)
df <- data.frame(id, q1, q2, q3, q4, q5)
df[, 2:6] <- apply(df[, 2:6], 2, as.numeric)

I would like to retain rows of participants' data when they leave a particular column, let's say q4, blank. Concurrently, I would like to exclude a row as long as any item other than q4 is left blank.

The end result I have in mind is like the screenshot below. Participant 001 and 005 are removed because they have missing q2 scores. Participant 003 is removed because of missing q1 scores. Even though participant 004 has a missing q4 score, I still want to retain participant 004 in the data frame.

If possible, could someone provide me with an answer that could be used in a dplyr pipe? Thanks and much appreciated!

nghauran · Accepted Answer

Try drop_na() from tidyr

library(tidyr)
df %>% drop_na(-q4)

# output
   id q1 q2 q3 q4 q5
#2 002  2  2  3  2  2
#4 004  3  1  5 NA  4

Retain rows with blank values in one particular column and remove rows with blank values in all other columns

Answers (2)

Related Questions