Reputation: 51
I have a data frame with 4 columns, each column represent a different treatment. Each column is fill with protein numbers on it and the columns have different number of rows between each other. Theres a way to compare all 4 columns and have as a result a fifth column saying if a value is found in which of the columns? I know I have some values that will happen in two or even maybe 3 of the colums and I was wondering if theres a way to get this as end result in a new column.
I tried Data$A %in% Data$B
but this just gives me TRUE
or FALSE
between two columns. I was looking for some option like match or even contain, but all options seens that can only give me a true or false answer.
What I need is something like this.
A B C
1 DSFG DSFG DSGG
2 DDEG DDED DDEE
3 HUGO HUGI HUGO
So if this is my table, I want the result like this
D(?) E
1 DSFG A,B
2 DSGG C
4 DDEG A
5 DDED B
6 DDEE C
7 HUGO A,C
8 HUGI B
Upvotes: 1
Views: 65
Reputation: 51582
An idea via base R is to use stack
to convert to long, and aggregate
to get the required output.
aggregate(ind ~ values, stack(df), toString)
# values ind
#1 DDED B
#2 DDEE C
#3 DDEG A
#4 DSFG A, B
#5 DSGG C
#6 HUGI B
#7 HUGO A, C
NOTE: Your columns need to be as.character
for this to work. (df[] <- lapply(df, as.character)
)
Stacking turns data into "long format":
stack(df)
values ind
1 DSFG A
2 DDEG A
3 HUGO A
4 DSFG B
5 DDED B
6 HUGI B
7 DSGG C
8 DDEE C
9 HUGO C
toString()
simply joins elements in a vector by comma
toString(c("A", "B", "C"))
[1] "A, B, C"
Aggregating returns a vector of "ind"s for each value, and these are then turned into a string using the function above:
aggregate(ind ~ values, stack(df), FUN=toString)
Upvotes: 3
Reputation: 3842
Doing it the tidy way:
df <- data.frame(A = c("DSFG", "DDEG", "HUGO"), B = c("DSFG", "DDED", "HUGI"), C = c("DSGG", "DDEE", "HUGO"))
library(tidyverse)
df %>%
gather("Column", "Value", 1:3) %>%
group_by(Value) %>%
summarise(Cols = paste(Column, collapse = ","))
Value Cols
DDED B
DDEE C
DDEG A
DSFG A,B
DSGG C
HUGI B
HUGO A,C
Upvotes: 0