Reputation: 619
This is a follow-up to a previous question I had.
In R:
I would like to determine whether selected rows of one data.frame are mostly negative or mostly positive, where the rows to check are given by the row numbers stored in a column of a second data.frame. My example data frames (data.frame1 and data.frame2) and the desired output are below. Is there also a way to keep the values from the ID column in the output?
> data.frame1
     ID col_a col_b col_c
1 Fish1     1     1     1
2 Fish1     1    -1    -1
3 Fish1     1     1     1
4 Fish1    -1    -1    -1
5 Fish1     1     1    -1
> data.frame2
  col_a
1     2
2     3
3     5
Example output/result
     ID col_a col_b
1 Fish1     2    -1
2 Fish1     3     1
3 Fish1     5     1
Upvotes: 0
Views: 37
Reputation: 887098
Use the row numbers in df2$col_a as an index to subset the first data frame, then replace the 'col_a' values with those row numbers from the second data frame.
out <- df1[df2$col_a, c("ID", "col_a", "col_b")]
out$col_a <- df2$col_a
-output
> out
     ID col_a col_b
2 Fish1     2    -1
3 Fish1     3     1
5 Fish1     5     1
If the intention is to find the most frequent value in each row, use a Mode function:
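Mode <- function(x) {
  # Most frequent value of x; ties go to the value that appears first
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}
# Keep only the rows of df1 listed in df2$col_a
out <- df1[df2$col_a, ]
# stack() returns (values, ind); [2:1] puts the row labels (a factor) in col_a
# and the row-wise modes in col_b
out[c('col_a', 'col_b')] <- stack(apply(out[-1], 1, Mode))[2:1]
out$col_c <- NULL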
-output
> out
     ID col_a col_b
2 Fish1     2    -1
3 Fish1     3     1
5 Fish1     5     1
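Because the measurements here are only 1 and -1, "mostly positive or mostly negative" can also be read as the sign of each row's sum. The sketch below is an alternative to Mode, not part of the approach above. It assumes every value column holds only 1 or -1, and the helper name majority_sign is made up for illustration. The data are rebuilt inline so the block runs on its own.

```r
# Example data, matching the df1/df2 used in this answer
df1 <- data.frame(
  ID    = rep("Fish1", 5),
  col_a = c(1L, 1L, 1L, -1L, 1L),
  col_b = c(1L, -1L, 1L, -1L, 1L),
  col_c = c(1L, -1L, 1L, -1L, -1L)
)
df2 <- data.frame(col_a = c(2L, 3L, 5L))

# Hypothetical helper (assumption: value columns contain only 1 and -1).
# For each requested row, the sign of the row sum is 1 when positives
# outnumber negatives, -1 when negatives outnumber positives, 0 on a tie.
majority_sign <- function(data, rows, value_cols) {
  picked <- data[rows, , drop = FALSE]
  data.frame(
    ID    = picked$ID,
    col_a = rows,
    col_b = unname(sign(rowSums(picked[value_cols])))
  )
}

out2 <- majority_sign(df1, df2$col_a, c("col_a", "col_b", "col_c"))
out2  # col_b is -1, 1, 1 for rows 2, 3, 5
```

The two approaches differ on ties, which can only occur with an even number of value columns: Mode returns whichever value appears first in the row, while the row-sum version returns 0. Here col_a also stays an integer column rather than becoming a factor.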
# Example data used above
df1 <- structure(list(ID = c("Fish1", "Fish1", "Fish1", "Fish1", "Fish1"),
    col_a = c(1L, 1L, 1L, -1L, 1L),
    col_b = c(1L, -1L, 1L, -1L, 1L),
    col_c = c(1L, -1L, 1L, -1L, -1L)),
  class = "data.frame", row.names = c("1", "2", "3", "4", "5"))
df2 <- structure(list(col_a = c(2L, 3L, 5L)),
  class = "data.frame", row.names = c("1", "2", "3"))
Upvotes: 1