OAM

Reputation: 189

R: Assign string to data frame based on numerical value

I have two data frames. The rows of data2$v1 are in the same order as the columns of data1 (excluding the column data1$matched).

 data1 <- data.frame(hellore = c(.05, .8, .9 ), internationality = c(1,.03,1), matched = c("hello", "international", "hero"))

 data2 <- data.frame(v1 = c("hellore", "internationality"))

I need an algorithm that finds the minimum value in each column of data1 (with the additional requirement that the value must be less than or equal to 0.05) and assigns the corresponding string from data1$matched to data2$v2. The result should look like this:

data.final <- data.frame(v1 = c("hellore", "internationality"), v2 = c("hello", "international"))

I tried this, but it is not dynamic:

data2$v2 <- NA
 values=data1$matched[which(min(data1[,1]) & (data1[,1] <= 0.05))]
 data2[1,2] <- paste(values)

 values=data1$matched[which(min(data1[,2]) & (data1[,2] <= 0.05))]
 data2[2,2] <- paste(values)

Does anyone have an idea how to solve this in a vectorized way?

Update

Thanks! The solution below works for the example above. Now I have the problem that if no corresponding value exists in data1, the number of matched strings differs from the number of rows in data2, and I can no longer assign the strings to data2. See the code and the error message:

 data1 <- data.frame(hellore = c(.05, .8, .9 ), internationality = c(1,.03,1), matched = c("hello", "international", "hero"))

 data2 <- data.frame(v1 = c("hellore", "internationality", "bonbon"))

 idx <- unlist(unname(sapply(data1[-3], function(x) if(min(x) <= 0.05) which.min(x))))
 data2$v2 <- data1$matched[idx]

Error in $<-.data.frame(*tmp*, "v2", value = c(1L, 3L)) :
Replacement has 2 rows, data has 3

Upvotes: 0

Views: 1359

Answers (1)

talat

Reputation: 70266

You could try the following approach (which, however, is not vectorised, since it uses sapply):

idx <- unlist(unname(sapply(data1[-3], function(x) if(min(x) <= 0.05) which.min(x))))
data2$v2 <- data1$matched[idx]
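
One caveat with the code above: a column whose minimum is above 0.05 yields NULL, and unlist() drops it, so the results for later columns shift up by one position. The following is a minimal sketch of a variant that keeps the alignment by returning NA instead, assuming the same data1 and data2 as in the question. The helper name pick_row and the stringsAsFactors = FALSE setting are my own additions, not part of the original code.

```r
data1 <- data.frame(hellore = c(.05, .8, .9),
                    internationality = c(1, .03, 1),
                    matched = c("hello", "international", "hero"),
                    stringsAsFactors = FALSE)
data2 <- data.frame(v1 = c("hellore", "internationality"),
                    stringsAsFactors = FALSE)

# Hypothetical helper: row index of the column minimum, or NA when the
# minimum is above the threshold.
pick_row <- function(x, threshold = 0.05) {
  i <- which.min(x)
  if (x[i] <= threshold) i else NA_integer_
}

# Apply the helper to every column except 'matched'; vapply always returns
# exactly one integer per column, so positions stay aligned with the columns.
value_cols <- setdiff(names(data1), "matched")
idx <- vapply(data1[value_cols], pick_row, integer(1))

# Indexing with NA gives NA, so failing columns end up as NA in v2.
data2$v2 <- data1$matched[unname(idx)]
```

Like the sapply version, this still loops over the columns internally; it only changes how columns without a qualifying minimum are represented.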

Edit

For the updated example, you can use the following adjusted code:

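# Rows of data1 holding each column minimum that is <= 0.05 (columns that fail are dropped)
idx <- unlist(unname(sapply(data1[-3], function(x) if(min(x) <= 0.05) which.min(x))))
# Pad with NA at the end; this assumes the unmatched entries are the last rows of data2
data2$v2 <- c(as.character(data1$matched[idx]), rep(NA, nrow(data2) - length(idx)))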
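
If the entries without a match are not guaranteed to come last in data2, looking them up by column name may be more robust. This is only a sketch under the assumption that data2$v1 holds column names of data1; the variable name value_cols is introduced here for illustration, and match() is used in place of relying on row order.

```r
data1 <- data.frame(hellore = c(.05, .8, .9),
                    internationality = c(1, .03, 1),
                    matched = c("hello", "international", "hero"),
                    stringsAsFactors = FALSE)
data2 <- data.frame(v1 = c("hellore", "internationality", "bonbon"),
                    stringsAsFactors = FALSE)

value_cols <- setdiff(names(data1), "matched")

# Row index of each column's minimum, NA when the minimum exceeds 0.05.
idx <- vapply(data1[value_cols], function(x) {
  i <- which.min(x)
  if (x[i] <= 0.05) i else NA_integer_
}, integer(1))

# Look up every data2$v1 entry by column name; names that do not exist in
# data1 (here "bonbon") give NA from match() and therefore NA in v2.
data2$v2 <- as.character(data1$matched)[idx[match(data2$v1, value_cols)]]
```

With the example data, v2 should contain "hello", "international" and NA, regardless of where "bonbon" appears in data2.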

Upvotes: 2
