confusedindividual

Reputation: 619

Getting group column and column of negative and positive values based on a different data frame

This is a follow-up to a previous question I had.

In R:

I would like to determine whether certain rows of one data frame are mostly negative or mostly positive, where the rows to check are given by the row numbers stored in a column of a second data frame. My example data frames (data.frame1 and data.frame2) and the desired output are below. Is there also a way to include the values from the ID column in the output?

> data.frame1
     ID col_a col_b col_c
1 Fish1     1     1     1
2 Fish1     1    -1    -1
3 Fish1     1     1     1
4 Fish1    -1    -1    -1
5 Fish1     1     1    -1

> data.frame2
  col_a
1     2
2     3
3     5


Example output/result:

     ID col_a col_b
1 Fish1     2    -1
2 Fish1     3     1
3 Fish1     5     1

Upvotes: 0

Views: 37

Answers (1)

akrun

Reputation: 887098

Use the values in df2$col_a as row indices to subset the first data frame, then overwrite 'col_a' in the result with those index values:

out <- df1[df2$col_a, c("ID", "col_a", "col_b")]
out$col_a <- df2$col_a

-output

> out
     ID col_a col_b
2 Fish1     2    -1
3 Fish1     3     1
5 Fish1     5     1
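
The subsetting above indexes df1 by row position, so an index in df2$col_a that is NA or larger than nrow(df1) would produce a row of NAs rather than an error. As a minimal sketch (the names idx and valid are introduced here for illustration, and the data is recreated from the question), the indices could be checked before subsetting:

```r
# Example data recreated from the question
df1 <- data.frame(
  ID    = rep("Fish1", 5),
  col_a = c(1L, 1L, 1L, -1L, 1L),
  col_b = c(1L, -1L, 1L, -1L, 1L),
  col_c = c(1L, -1L, 1L, -1L, -1L)
)
df2 <- data.frame(col_a = c(2L, 3L, 5L))

# Illustrative guard: keep only indices that point at an existing row of df1
idx   <- df2$col_a
valid <- !is.na(idx) & idx >= 1 & idx <= nrow(df1)
if (!all(valid)) {
  warning("dropping row indices that do not exist in df1")
}

# Same subsetting as above, restricted to the valid indices
out <- df1[idx[valid], c("ID", "col_a", "col_b")]
out$col_a <- idx[valid]
```

With the example data every index is valid, so the result is the same as the output shown above.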

If the intention is to find the most frequent value in each row, use a Mode function:

Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}
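out <- df1[df2$col_a, ]
# Row-wise mode of the measurement columns (ID dropped). stack() turns the
# named result into columns (values, ind); [2:1] reorders them so that
# col_a receives the row names (stored as a factor) and col_b the mode.
out[c('col_a', 'col_b')] <- stack(apply(out[-1], 1, Mode))[2:1]
out$col_c <- NULL

-output

> out
     ID col_a col_b
2 Fish1     2    -1
3 Fish1     3     1
5 Fish1     5     1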
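
When the measurements are only +1 and -1, as in the example data, the sign of each row sum also tells whether a row is mostly positive or mostly negative. The following is a sketch of that alternative, not the method used above. The names value_cols, row_totals and out2 are introduced here for illustration, and it assumes every column other than ID is a measurement column:

```r
# Example data recreated from the question
df1 <- data.frame(
  ID    = rep("Fish1", 5),
  col_a = c(1L, 1L, 1L, -1L, 1L),
  col_b = c(1L, -1L, 1L, -1L, 1L),
  col_c = c(1L, -1L, 1L, -1L, -1L)
)
df2 <- data.frame(col_a = c(2L, 3L, 5L))

# Measurement columns: everything except ID (an assumption of this sketch)
value_cols <- setdiff(names(df1), "ID")

# Sum the +1/-1 values for the selected rows only
row_totals <- rowSums(df1[df2$col_a, value_cols])

# The sign of each sum is the majority sign of that row
out2 <- data.frame(
  ID    = df1$ID[df2$col_a],
  col_a = df2$col_a,
  col_b = unname(sign(row_totals))
)
```

With an even number of measurement columns a row can tie, in which case sign() returns 0 rather than picking a side, so ties would need separate handling.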

data

df1 <- structure(list(ID = c("Fish1", "Fish1", "Fish1", "Fish1", "Fish1"
), col_a = c(1L, 1L, 1L, -1L, 1L), col_b = c(1L, -1L, 1L, -1L, 
1L), col_c = c(1L, -1L, 1L, -1L, -1L)), class = "data.frame", row.names = c("1", 
"2", "3", "4", "5"))

df2 <- structure(list(col_a = c(2L, 3L, 5L)),
 class = "data.frame", row.names = c("1", 
"2", "3"))

Upvotes: 1
