user969113

Reputation: 2429

Add specific value to a data.frame column by matching a pattern

I have two data.frames:

pattern <- data.frame(pattern = c("A", "B", "C", "D"), val = c(1, 1, 2, 2))

match <- data.frame(match = c("A", "C"))

I want to add another column, new_val, to the data.frame pattern: "X" for each row whose value in column pattern appears in the data.frame match, and "Y" otherwise.

is.element(pattern$pattern, match$match)

[1] TRUE FALSE TRUE FALSE

So, the resulting data.frame should look like:

    pattern val new_val
1   A       1   X
2   B       1   Y
3   C       2   X
4   D       2   Y

I managed to do it with an ugly for-loop, but I am sure this can be done in a one-line R command using fancy stuff :-)

Is anyone able to help?

Many thanks!

Upvotes: 4

Views: 4850

Answers (2)

Tyler Rinker

Reputation: 109864

Here's one way. I renamed your match to mat and made it a plain vector, since there's a pretty important base function named match that you could actually use to solve this problem; in fact, %in% is a form of match:

pattern <- data.frame(pattern = c("A", "B", "C", "D"), val = c(1, 1, 2, 2))
mat <- c("A", "C")

pattern$new_val <- "Y"                            # default every row to Y
pattern$new_val[pattern$pattern %in% mat] <- "X"  # overwrite rows whose pattern is A or C with X
pattern
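
To illustrate the remark that %in% is a form of match, here is a minimal sketch on a bare character vector. The variable names x, pos, via_match and via_in are only for illustration and are not part of the original answer.

```r
# Illustrative data: the same values as the pattern column and the lookup vector
x   <- c("A", "B", "C", "D")
mat <- c("A", "C")

# match() returns the position of each element of x in mat, or NA if absent
pos <- match(x, mat)

# "was found" is the same as "position is not NA"
via_match <- !is.na(pos)

# %in% gives the same logical vector directly
via_in <- x %in% mat

identical(via_match, via_in)

# So match() alone is enough to build the new column
new_val <- ifelse(is.na(pos), "Y", "X")
new_val
```

Either form can be used for the subsetting step above; %in% is simply the shorter way to write it.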

PS: if you wanted a one-liner, data.table would likely do it.
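
A data.table one-liner might look like the sketch below. It assumes the data.table package is installed, and the object name dt is chosen here only to avoid clashing with the column name pattern; it is one possible way to write it, not necessarily what was meant above.

```r
library(data.table)

# Same data as above, built as a data.table (dt is an illustrative name)
dt  <- data.table(pattern = c("A", "B", "C", "D"), val = c(1, 1, 2, 2))
mat <- c("A", "C")

# The one-liner: add new_val by reference with :=
dt[, new_val := ifelse(pattern %in% mat, "X", "Y")]
dt
```

Note that := modifies dt in place, so no reassignment is needed.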

If you wanted something a little more complicated you could use a function from a package I'm working on:

library(qdap)

#original problem
pattern$new_val <- text2color(pattern$pattern, list(c("A", "C")), c("X", "Y"))

#extending it
#makes D  a 5
text2color(pattern$pattern, list(c("A", "C"), "D"), c("X", 5, "Y"))

This function is really designed to do something else, but if you want to grab the essential parts of it, you can look at the source code.

Upvotes: 2

Dason

Reputation: 61933

I'm only really posting this since Tyler said "if you wanted a one liner data.table would likely do it" and I knew it was definitely possible with a one-liner in base. I am also assuming match has been renamed to mat.

  pattern$new_val <- c("Y", "X")[(pattern$pattern %in% mat)+1]
  pattern
#  pattern val new_val
#1       A   1       X
#2       B   1       Y
#3       C   2       X
#4       D   2       Y

pattern$pattern %in% mat checks each element of pattern$pattern against mat, returning TRUE if the element is in mat and FALSE if it is not. Adding 1 coerces those logical values to numbers in the range 1-2, so the result can be used for indexing. We then use it as an index into the vector c("Y", "X"); since the index is always 1 or 2, it always picks a valid element. So we get "Y" if the pattern wasn't in mat and "X" if it was, which is what you wanted.
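
The same one-liner can be unpacked step by step, as in the sketch below. The intermediate names in_mat, idx and labels are only for illustration, and the ifelse() line is an alternative spelling added for comparison, not part of the answer above.

```r
pattern <- data.frame(pattern = c("A", "B", "C", "D"), val = c(1, 1, 2, 2))
mat <- c("A", "C")

# Step 1: logical membership test
in_mat <- pattern$pattern %in% mat   # TRUE FALSE TRUE FALSE

# Step 2: adding 1 turns FALSE/TRUE into 1/2
idx <- in_mat + 1                    # 1 means "not in mat", 2 means "in mat"

# Step 3: index the two-element lookup vector
labels <- c("Y", "X")[idx]

# Alternative: ifelse() expresses the same choice more explicitly
alt <- ifelse(in_mat, "X", "Y")
identical(labels, alt)

pattern$new_val <- labels
pattern
```

The indexing trick is compact, while ifelse() is arguably easier to read; both give the same column here.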

Upvotes: 3
