subset data across multiple columns

Question

Consider the data here:

X <- 1:4
Ya <- 10:13
Yb <- 2:5
Yc <- c(10,11,6,NA)

df <- data.frame(X, Ya, Yb, Yc)

For each X value, I want to extract the unique Y values (from Ya:Yc)

So I am trying to achieve an output:

# the first number is the X value, then the proceeding numbers are
# the unique Ya:Yc values for each row
# 1, 10, 2
# 2, 11, 3
# 3, 12, 4 , 6
# 4, 13, 5

I have tried using a simple for loop.

output1 <- c(NA,NA,NA,NA)

for(i in 1:4) {
  output1[i] <- c(i,as.numeric(unique(df[i,2:4 ])))
}

Steven Beaupr&#233; · Accepted Answer

Try:

library(dplyr)
library(tidyr)

df %>%
  gather(key, value, -X) %>%
  group_by(X) %>%
  distinct(value) %>%
  spread(key, value)

Which gives:

#Source: local data frame [4 x 4]
#Groups: X [4]
#
#      X    Ya    Yb    Yc
#  (int) (dbl) (dbl) (dbl)
#1     1    10     2    NA
#2     2    11     3    NA
#3     3    12     4     6
#4     4    13     5    NA

subset data across multiple columns

Answers (2)

Related Questions