Reputation: 47
I have these columns in a data frame that look like:
combination color_1 color_2
1_1 red red
1_2 red blue
1_3 red green
1_4 red yellow
2_1 blue red
2_2 blue blue
2_3 blue green
2_4 blue yellow
...
Based on whether the color_1 and color_2 values match, I would like to create new columns that record the result. The rule is as follows: in a new column (e.g. "Red-Only"), rows where color_1 and color_2 are both "red" should get a "1", and every other row should get a "2". I would then repeat this for rows where both values are "blue", writing "1" in the next column (e.g. "Blue-Only") and "2" everywhere else. The same goes for yellow-only matches, green-only matches, etc. So in the end I would have 4 extra columns, one per color.
Thanks for the help in advance!
Upvotes: 2
Views: 1393
Reputation: 76402
Here is a way that doesn't depend on knowing the names of the colors.
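# Returns 1L where both columns equal `color`, 2L otherwise.
fun <- function(color, DF, col1, col2){
  2L - (color == DF[[col1]] & color == DF[[col2]])
}
# The colors to test are taken from color_1 only.
cols1 <- unique(df1$color_1)
cbind(df1, sapply(cols1, fun, df1, 'color_1', 'color_2'))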
# combination color_1 color_2 red blue
#1 1_1 red red 1 2
#2 1_2 red blue 2 2
#3 1_3 red green 2 2
#4 1_4 red yellow 2 2
#5 2_1 blue red 2 2
#6 2_2 blue blue 2 1
#7 2_3 blue green 2 2
#8 2_4 blue yellow 2 2
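Note that `unique(df1$color_1)` only picks up colors that occur in `color_1`, which is why the output above has no green or yellow columns. Here is a minimal sketch of one way to get a column for every color, assuming you want one for each value that appears in either column. The names `all_cols`, `res`, `df2` and the `_only` suffix are illustrative and not part of the code above.

```r
# Same sample data as in this answer.
df1 <- read.table(text = "
combination color_1 color_2
1_1 red red
1_2 red blue
1_3 red green
1_4 red yellow
2_1 blue red
2_2 blue blue
2_3 blue green
2_4 blue yellow
", header = TRUE, stringsAsFactors = FALSE)

# Same helper as above: 1L where both columns equal `color`, 2L otherwise.
fun <- function(color, DF, col1, col2) {
  2L - (color == DF[[col1]] & color == DF[[col2]])
}

# Assumption: collect the colors from both columns, not just color_1.
all_cols <- unique(c(df1$color_1, df1$color_2))

# One matrix column per color, renamed with an illustrative "_only" suffix.
res <- sapply(all_cols, fun, df1, "color_1", "color_2")
colnames(res) <- paste0(all_cols, "_only")

df2 <- cbind(df1, res)
```

With this sample data the green and yellow columns contain only 2s, since no row has green or yellow in both columns.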
Data.
df1 <- read.table(text = "
combination color_1 color_2
1_1 red red
1_2 red blue
1_3 red green
1_4 red yellow
2_1 blue red
2_2 blue blue
2_3 blue green
2_4 blue yellow
", header = TRUE, stringsAsFactors = FALSE)
Upvotes: 1
Reputation: 11957
Let's start with your existing data:
df <- structure(list(combination = c("1_1", "1_2", "1_3", "1_4", "2_1",
"2_2", "2_3", "2_4"), color_1 = c("red", "red", "red", "red",
"blue", "blue", "blue", "blue"), color_2 = c("red", "blue", "green",
"yellow", "red", "blue", "green", "yellow")), class = "data.frame", row.names = c(NA,
-8L))
combination color_1 color_2
1 1_1 red red
2 1_2 red blue
3 1_3 red green
4 1_4 red yellow
5 2_1 blue red
6 2_2 blue blue
7 2_3 blue green
8 2_4 blue yellow
One solution would be to loop over your four color categories, checking for matches.
colors <- c('red', 'green', 'yellow', 'blue')
matches <- lapply(colors, function(x) {
  # 1 where both columns equal x, 2 otherwise
  ifelse(with(df, color_1 == color_2 & color_1 == x), 1, 2)
})
Then name the results of this operation with your intended column names:
names(matches) <- paste(colors, 'only', sep = '_')
Finally, glue the results together with the original data:
df.new <- cbind(df, as.data.frame(matches))
combination color_1 color_2 red_only green_only yellow_only blue_only
1 1_1 red red 1 2 2 2
2 1_2 red blue 2 2 2 2
3 1_3 red green 2 2 2 2
4 1_4 red yellow 2 2 2 2
5 2_1 blue red 2 2 2 2
6 2_2 blue blue 2 2 2 1
7 2_3 blue green 2 2 2 2
8 2_4 blue yellow 2 2 2 2
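If you need this for more than one pair of columns, the three steps above could be wrapped into a single function. This is only a sketch: `add_match_columns` and its arguments are hypothetical names, and the small data frame below is a cut-down stand-in for your data.

```r
# Hypothetical helper (not from the steps above) combining match, name, and bind.
add_match_columns <- function(data, col1, col2, colors, suffix = "only") {
  matches <- lapply(colors, function(x) {
    # 1 where both columns equal x, 2 otherwise
    ifelse(data[[col1]] == data[[col2]] & data[[col1]] == x, 1, 2)
  })
  names(matches) <- paste(colors, suffix, sep = "_")
  cbind(data, as.data.frame(matches))
}

# Cut-down example data with two colors only.
df <- data.frame(
  combination = c("1_1", "1_2", "2_1", "2_2"),
  color_1 = c("red", "red", "blue", "blue"),
  color_2 = c("red", "blue", "red", "blue"),
  stringsAsFactors = FALSE
)

df.new <- add_match_columns(df, "color_1", "color_2", c("red", "blue"))
```

Passing the column names as strings keeps the function independent of any particular data frame.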
Upvotes: 3
Reputation: 2770
You can use ifelse. If you have a lot of colors, looping would be a good idea.
cols <- data.frame(
color_1=c("Red","Red","Red","Red","Blue","Blue","Blue","Blue"),
color_2=c("Red","Blue","Green","Yellow","Red","Blue","Green","Yellow")
)
cols$redonly <- ifelse(cols$color_1 %in% "Red" & cols$color_2 %in% "Red", 1, 2)
cols$blueonly <- ifelse(cols$color_1 %in% "Blue" & cols$color_2 %in% "Blue", 1, 2)
cols$greenonly <- ifelse(cols$color_1 %in% "Green" & cols$color_2 %in% "Green", 1, 2)
cols$yellowonly <- ifelse(cols$color_1 %in% "Yellow" & cols$color_2 %in% "Yellow", 1, 2)
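The looping version might look something like the sketch below. It assumes you want one column per color found in either column; `all_colors`, `new_name` and the `_only` naming scheme are illustrative choices.

```r
cols <- data.frame(
  color_1 = c("Red", "Red", "Red", "Red", "Blue", "Blue", "Blue", "Blue"),
  color_2 = c("Red", "Blue", "Green", "Yellow", "Red", "Blue", "Green", "Yellow"),
  stringsAsFactors = FALSE
)

# Assumption: every color that occurs in either column gets its own column.
all_colors <- unique(c(cols$color_1, cols$color_2))

for (col in all_colors) {
  # Build a column name such as "red_only" (illustrative naming scheme).
  new_name <- paste0(tolower(col), "_only")
  # 1 where both columns equal the current color, 2 otherwise.
  cols[[new_name]] <- ifelse(cols$color_1 == col & cols$color_2 == col, 1, 2)
}
```

This avoids writing one ifelse line per color, and it picks up new colors automatically if the data changes.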
Upvotes: 2