A Hud
A Hud

Reputation: 95

Return the value out of a group of columns based on another column's location

I'm trying to pull in the value out of a group of columns based on the another column. Walking through the first row for an example: -Take value of CodeToMatch = 1 -Search through columns: Code.1, Code.2, Code.3 to locate where the value of 1 is. In this case, it's in the 3rd column, therefore, return the value of the 3rd column out of pCode.1, pCode.2, pCode3, which is "p4"

The expected_outcome column in my example df below shows what I'm after.

Any help is much appreciated!

c1 <- c("1","2","3")
c2 <- c("8","1","3")
c3 <- c("4","2","4")
c4 <- c("1","3","5")
c5 <- c("p1","p2","p3")
c6 <- c("p8","p1","p3")
c7 <- c("p4","p2","p4")
c8 <- c("p4","p1","p3")
df <- data.frame(c1,c2,c3,c4,c5,c6,c7,c8)
colnames(df)[c(1:8)] <- c("CodeToMatch","Code.1","Code.2","Code.3","pCode.1","pCode.2","pCode.3","expected_output")

Upvotes: 0

Views: 39

Answers (3)

Wimpel
Wimpel

Reputation: 27732

data.table solution

sample data

df <- structure(list(CodeToMatch = structure(1:3, .Label = c("1", "2", 
"3"), class = "factor"), Code.1 = structure(c(3L, 1L, 2L), .Label = c("1", 
"3", "8"), class = "factor"), Code.2 = structure(c(2L, 1L, 2L
), .Label = c("2", "4"), class = "factor"), Code.3 = structure(1:3, .Label = c("1", 
"3", "5"), class = "factor"), pCode.1 = structure(1:3, .Label = c("p1", 
"p2", "p3"), class = "factor"), pCode.2 = structure(c(3L, 1L, 
2L), .Label = c("p1", "p3", "p8"), class = "factor"), pCode.3 = structure(c(2L, 
1L, 2L), .Label = c("p2", "p4"), class = "factor")), class = "data.frame", row.names = c(NA, 
-3L))

code

library(data.table)
#first, melt wide table to long format
df.melt <- melt( setDT(df), id.vars="CodeToMatch", measure.vars = patterns(Code="^Code\\..*", pCode="^pCode.*"))
#now finding everything is easy...
df.melt[ Code == CodeToMatch, .(CodeToMatch, pCode)]

output

#    CodeToMatch pCode
# 1:           3    p3
# 2:           2    p1
# 3:           1    p4

Upvotes: 3

Ronak Shah
Ronak Shah

Reputation: 388797

Separate the code and pCode columns based on the pattern in the names. Find out the index of CodeToMatch which lies in code_columns for every row and extract the corresponding pcode_columns from it using mapply.

code_columns <- grep("^Code\\.[0-9]+", names(df))
pcode_columns <- grep("^pCode", names(df))

mapply(function(x, y) df[x, pcode_columns][df[x, code_columns]==y],
                       1:nrow(df), df$CodeToMatch)

#[1] "p4" "p1" "p3"

Ran

df[1:4] <- lapply(df[1:4], function(x) as.numeric(as.character(x)))

to keep numeric columns as numeric and not factors.

Upvotes: 0

Maurits Evers
Maurits Evers

Reputation: 50668

I've got no idea how well this generalises but here is an option

nCode <- 3
df$expected_output <- apply(df, 1, function(x) x[nCode + 1 + which(x[2:(nCode + 1)] == x[1])])
df$expected_output
#[1] "p4" "p1" "p3"

Note that the number of "code" columns is hard-coded. In your case you have 3 "Code" columns with matching "pCode" columns. Adjust as necessary. This also assumes that the first column always contains the to-be-matched code numbers.

Upvotes: 0

Related Questions