user15791858

Reputation: 185

Updating based on Condition of Previous Occurrence

I have a data frame

  stim1 stim2 Chosen Rejected
1:     2     1      2        1
2:     3     2      2        3
3:     3     1      1        3
4:     2     3      3        2
5:     1     3      1        3

My objective is to add columns that specify, at each trial, whether each stimulus was most recently (in previous trials) Chosen or Rejected.

Desired outcome

  stim1 stim2 Chosen Rejected     Previous_stim1   Previous_stim2
1:     2     1      2        1        NaN              NaN
2:     3     2      2        3        NaN              Chosen
3:     3     1      1        3        Rejected         Rejected
4:     2     3      3        2        Chosen           Rejected
5:     1     3      1        3        Chosen           Chosen

Any help will be greatly appreciated!


UPDATE

TarJae had a really helpful suggestion that correctly categorized the piece of the data frame I shared. I didn't mention that it is really part of a larger data frame, and for some reason this method fairly quickly stops classifying correctly:

   stim1 stim2 Chosen Rejected Previous_stim1 Previous_stim2
 1:     2     1      2        1           <NA>           <NA>
 2:     3     2      2        3           <NA>         Chosen
 3:     3     1      1        3       Rejected       Rejected
 4:     2     3      3        2         Chosen       Rejected
 5:     1     3      1        3         Chosen         Chosen
 6:     2     1      1        2         Chosen         Chosen
 7:     2     3      2        3         Chosen         Chosen
 8:     3     1      1        3         Chosen         Chosen
 9:     2     1      2        1         Chosen         Chosen

For example, in row 6 stim1==2. Most recently, 2 was rejected (row 4), but the method classified it as chosen.

Any ideas why this happens?

Thank you again for everyone's help.


Update 2

Thank you so much for your help. But say I also have a column with the "outcome".

   stim1 stim2 Chosen Rejected outcome Previous_stim 1 Previous_stim 2
1:    15    13     15       13       1            <NA>            <NA>
2:    13    14     14       13       1        Rejected            <NA>
3:    14    15     14       15       1          Chosen          Chosen
4:    14    13     14       13       0          Chosen        Rejected
5:    13    15     13       15       0        Rejected        Rejected
6:    14    15     14       15       1          Chosen        Rejected
7:    15    13     15       13       1        Rejected          Chosen
8:    14    15     14       15       0          Chosen          Chosen

I want to encode whether the stimulus was
1) most recently chosen and outcome=1 (can be coded as 1)
2) most recently chosen and outcome=0 (can be coded as 2)
3) most recently rejected and outcome=1 (can be coded as 3)
4) most recently rejected and outcome=0 (can be coded as 4)

Is there an easy way to modify the code to make that happen?

Desired output

  stim1 stim2 Chosen Rejected outcome Previous_stim 1 Previous_stim 2 Left_type right_type
1     2     3      2        3       1            <NA>            <NA>       NaN        NaN
2     1     3      3        1       1            <NA>        Rejected       NaN          3
3     2     1      1        2       1          Chosen        Rejected         1          3
4     1     2      1        2       0          Chosen        Rejected         1          3
5     3     1      3        1       1          Chosen          Chosen         1          2

LAST FOLLOW UP

Finally, I would like to add a column checking whether the chosen stimulus in that previous trial (that is, the most recent trial in which the stimulus in question was rejected) is the same as my current alternative stimulus.

For example, if I have

  stim1 stim2 Chosen Rejected     Previous_stim1   Previous_stim2
1:     2     1      2        1        NaN              NaN
2:     3     2      2        3        NaN              Chosen
3:     3     1      1        3        Rejected         Rejected
4:     2     3      3        2        Chosen           Rejected
5:     1     3      1        3        Chosen           Chosen

And here is how I would update my table.

In trial 3, stim1 (i.e. 3) was previously rejected in favor of 2 (in trial 2), not in favor of 1 (the current alternative), so Current_alternative_left=0.

Similarly, stim2 (i.e. 1) was previously rejected, but it was rejected in favor of 2 (in trial 1), so Current_alternative_right=0.

On the other hand, in trial 4, stim1=2 was previously chosen over the same stimulus it is currently pitted against (3), so Current_alternative_left=1.

Desired Output

  stim1 stim2 Chosen Rejected outcome Previous_stim 1 Previous_stim 2 Left_type right_type
1     2     3      2        3       1            <NA>            <NA>       NaN        NaN
2     1     3      3        1       1            <NA>        Rejected       NaN          3
3     2     1      1        2       1          Chosen        Rejected         1          3
4     1     2      1        2       0          Chosen        Rejected         1          3
5     3     1      3        1       1          Chosen          Chosen         1          2

Current_alternative_left    Current_alternative_right
NaN                           NaN
NaN                           0
0                             0 
1                             0
1                             0     

I am new to data.table, but I tried to adapt ThomasIsCoding's function to return this as well with

h <- function(stim, cr) {
  stim_chosen <- rep(NA, length(stim))
  for (k in seq_along(stim)[-1]) {
    ind <- which(cr[1:(k - 1), , drop = FALSE] == stim[k], arr.ind = TRUE)
    if (length(ind)) {
      stim_chosen[k] <- stim[tail(ind, 1)[, "row"]]
    }
  }
  stim_chosen
}

setDT(df)[
  ,
  paste0("Chosen_Last", 1:2) := lapply(
    .(stim1, stim2),
    h,
    cr = cbind(Chosen, Rejected)
  )
]

though this is not quite giving me the correct answer. Does anyone know where I am going wrong?

Upvotes: 4

Views: 144

Answers (3)

ThomasIsCoding

Reputation: 101099

For Update 2

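library(data.table)

# f is the helper function defined further down in this answer
setDT(df)[
  ,
  paste0("Previous_stim", 1:2) := lapply(
    .(stim1, stim2),
    f,
    cr = cbind(Chosen, Rejected)
  )
][
  ,
  # Chosen/outcome=1 -> 1, Chosen/outcome=0 -> 2, Rejected/outcome=1 -> 3, Rejected/outcome=0 -> 4
  paste0(c("left", "right"), "type") := lapply(.SD, function(x) 2 * (x == "Rejected") + 2 - outcome),
  .SDcols = patterns("Previous")
][]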

gives

   stim1 stim2 Chosen Rejected outcome Previous_stim1 Previous_stim2 lefttype righttype
1:     2     1      2        1       1           <NA>           <NA>       NA        NA
2:     3     2      2        3       1           <NA>         Chosen       NA         1
3:     3     1      1        3       1       Rejected       Rejected        3         3
4:     2     3      3        2       0         Chosen       Rejected        2         4
5:     1     3      1        3       0         Chosen         Chosen        2         2
6:     2     1      1        2       1       Rejected         Chosen        3         1
7:     2     3      2        3       1       Rejected       Rejected        3         3
8:     3     1      1        3       0       Rejected         Chosen        4         2
9:     2     1      2        1       0         Chosen         Chosen        2         2
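As a quick check of the arithmetic in the call above, the sketch below enumerates the four status/outcome combinations and applies the same expression. The `combos` data frame is a hypothetical helper introduced only for illustration. Note that `outcome` in the call above is the value on the current row.

```r
# Enumerate every combination of previous status and outcome
# (`combos` is a hypothetical name used only for this illustration)
combos <- expand.grid(
  outcome = c(1, 0),
  status = c("Chosen", "Rejected"),
  stringsAsFactors = FALSE
)

# Same expression as in the data.table call:
# "Rejected" adds 2, and outcome = 0 adds 1 on top of the base value
combos$code <- 2 * (combos$status == "Rejected") + 2 - combos$outcome

print(combos)
```

Rows where the previous status is NA yield NA, which matches the NA entries in the output above.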

Data

> dput(df)
structure(list(stim1 = c(2L, 3L, 3L, 2L, 1L, 2L, 2L, 3L, 2L),
    stim2 = c(1L, 2L, 1L, 3L, 3L, 1L, 3L, 1L, 1L), Chosen = c(2L,
    2L, 1L, 3L, 1L, 1L, 2L, 1L, 2L), Rejected = c(1L, 3L, 3L,
    2L, 3L, 2L, 3L, 3L, 1L), outcome = c(1, 1, 1, 0, 0, 1, 1,
    0, 0)), class = "data.frame", row.names = c(NA, -9L))

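As per your update, you can define a custom function f and try the following code:

# For each trial k, find the most recent earlier trial in which stim[k]
# appears in cr (columns "Chosen" and "Rejected") and return that column name.
f <- function(stim, cr) {
  res <- rep(NA, length(stim))
  for (k in seq_along(stim)[-1]) {
    # all (row, col) positions in earlier trials that match the current stimulus
    ind <- which(cr[1:(k - 1), , drop = FALSE] == stim[k], arr.ind = TRUE)
    if (length(ind)) {
      # order the matches by row and keep the column of the latest one
      res[k] <- colnames(cr)[tail(ind[, "col"][order(ind[, "row"])], 1)]
    }
  }
  res
}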

setDT(df)[
  ,
  paste("Previous_stim", 1:2) := lapply(
    .(stim1, stim2),
    f,
    cr = cbind(Chosen, Rejected)
  )
][]

and you will see

> setDT(df)[, paste("Previous_stim",1:2) := lapply(.(stim1,stim2),f, cr = cbind(Chosen, Rejected))][]
   stim1 stim2 Chosen Rejected Previous_stim 1 Previous_stim 2
1:     2     1      2        1            <NA>            <NA>
2:     3     2      2        3            <NA>          Chosen
3:     3     1      1        3        Rejected        Rejected
4:     2     3      3        2          Chosen        Rejected
5:     1     3      1        3          Chosen          Chosen
6:     2     1      1        2        Rejected          Chosen
7:     2     3      2        3        Rejected        Rejected
8:     3     1      1        3        Rejected          Chosen
9:     2     1      2        1          Chosen          Chosen
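For the last follow-up in the question, the same "most recent earlier trial" idea could be extended as in the sketch below. This is not part of the original answer: `same_alternative` and `df5` are hypothetical names, the code uses base R only, and it assumes every trial's Chosen and Rejected values are exactly the two stimuli shown, with no missing values. It compares the opponent from the most recent earlier trial with the current alternative.

```r
# Hypothetical helper (an assumption, not from the original answer):
# for each trial, find the most recent earlier trial that contains `stim`
# and check whether the other stimulus in that trial equals `alt`.
same_alternative <- function(stim, alt, chosen, rejected) {
  res <- rep(NA_integer_, length(stim))
  for (k in seq_along(stim)[-1]) {
    prev <- seq_len(k - 1)
    hits <- which(chosen[prev] == stim[k] | rejected[prev] == stim[k])
    if (length(hits)) {
      j <- max(hits)  # most recent earlier trial containing stim[k]
      # the opponent is whichever of Chosen/Rejected is not stim[k]
      opponent <- if (chosen[j] == stim[k]) rejected[j] else chosen[j]
      res[k] <- as.integer(opponent == alt[k])
    }
  }
  res
}

# The five-trial example from the question
df5 <- data.frame(
  stim1 = c(2, 3, 3, 2, 1), stim2 = c(1, 2, 1, 3, 3),
  Chosen = c(2, 2, 1, 3, 1), Rejected = c(1, 3, 3, 2, 3)
)
df5$Current_alternative_left <-
  with(df5, same_alternative(stim1, stim2, Chosen, Rejected))
df5$Current_alternative_right <-
  with(df5, same_alternative(stim2, stim1, Chosen, Rejected))
print(df5)
```

On the five-trial example this gives NA, NA, 0, 1, 1 for the left column and NA, 0, 0, 0, 0 for the right column, which agrees with the Current_alternative columns in the question.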

Upvotes: 2

TarJae

Reputation: 78917

Here is a solution based on ThomasIsCoding's answer to "Check if value of column A is present in the same row or previous rows of column B". That question has additional answers that may also suit your problem; you can adapt whichever one fits. I chose the first one, provided by ThomasIsCoding.

The main task was to check the value against all previous rows of another column:

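library(dplyr)
# x, y: stim1 appeared in Chosen / Rejected in an earlier row; a, b: the same for stim2.
# match() returns the first matching position; coalesce() keeps the first non-NA value.
df %>% 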
    mutate(x = replace(rep(NA, length(Chosen)), match(stim1, lag(Chosen)) <= seq_along(stim1), "Chosen"),
           y = replace(rep(NA, length(Rejected)), match(stim1, lag(Rejected)) <= seq_along(stim1), "Rejected"),
           a = replace(rep(NA, length(Chosen)), match(stim2, lag(Chosen)) <= seq_along(stim2), "Chosen"),
           b = replace(rep(NA, length(Rejected)), match(stim2, lag(Rejected)) <= seq_along(stim2), "Rejected"),
           Previous_stim1 = coalesce(x, y),
           Previous_stim2 = coalesce(a, b)) %>% 
    select(stim1, stim2, Chosen, Rejected, Previous_stim1, Previous_stim2)
   stim1 stim2 Chosen Rejected Previous_stim1 Previous_stim2
1:     2     1      2        1           <NA>           <NA>
2:     3     2      2        3           <NA>         Chosen
3:     3     1      1        3       Rejected       Rejected
4:     2     3      3        2         Chosen       Rejected
5:     1     3      1        3         Chosen         Chosen
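One possible reading of why this approach drifts on longer data, as described in the question's update (an interpretation, not something this answer states): `match()` reports only the first matching position, and `coalesce()` takes the Chosen column first, so a stimulus that was chosen at any earlier point stays labelled "Chosen". The sketch below contrasts first and last positions for stimulus 2, using the first five rows of the question's data. The variable names are hypothetical.

```r
# First five trials from the question
chosen <- c(2, 2, 1, 3, 1)
rejected <- c(1, 3, 3, 2, 3)

# match() gives the first position at which stimulus 2 occurs
first_chosen <- match(2, chosen)
first_rejected <- match(2, rejected)

# The most recent occurrence needs the last position instead
last_chosen <- max(which(chosen == 2))
last_rejected <- max(which(rejected == 2))

# TRUE here means stimulus 2 was most recently rejected, not chosen
last_rejected > last_chosen
```

Comparing the last positions rather than the first ones is what distinguishes "most recently" from "ever".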

Upvotes: 2

M Daaboul

Reputation: 224

I think you need to correct the desired outcome in the table above. It looks like you are looking for the lag verb, which can hopefully solve this when used alongside if_else:

library(dplyr)

tbl <- tibble(stim1 = c(2,3,3,2,1), stim2 = c(1,2,1,3,3), 
              chosen = c(2,2,1,3,1), rejected = c(1,3,3,2,3))

# lag() looks only at the immediately preceding row
tbl %>% 
  mutate(Previous_stim1 = if_else(lag(chosen) == lag(stim1), "Chosen", "Rejected")) %>%
  mutate(Previous_stim2 = if_else(lag(chosen) == lag(stim2), "Chosen", "Rejected"))

Upvotes: 1
