Using case_when and lag() together?

Question

Apologies in advance that this question is massive- I'm trying to include as much detail as possible. The example table below is output from a behavioral task. In each of the 448 trials, there are visual masks at the beginning and end of each trial, one visual prime (six possible primes, indicated by codes 1:6), and one target to which a response is made (six possible, indicated by codes 7:12). Accuracy to the target is noted by either 1 (correct) or 0 (incorrect).

I now need to add new codes to the Response rows. These news codes will have to be new numbers that represent each combination of target type and accuracy. For example, if there was a square target (code 12), and the participant was correct (Accuracy=1), the new code would be 25; if there was a square target (12) and they were incorrect (Accuracy=0), the new code would be 26, and so on for each combination of target (7:12) and accuracy (0:1) (12 combos total).

Trial	Label	Code	Accuracy
1	Pre-Mask	13	1
1	BluePrime	6	1
1	SqureTarget	12	1
1	Post-Mask	14	1
1	Response	NA	1
2	Pre-Mask	13	0
2	RedPrime	4	0
2	CircleTarget	9	0
2	Post-Mask	14	0
2	Response	NA	0

Basically, I need a conditional statement that replaces the NA's based on information from two different columns from two rows prior to the missing values. The output should look like this:

Trial	Label	Code	Accuracy
1	Pre-Mask	13	1
1	BluePrime	6	1
1	SqureTarget	12	1
1	Post-Mask	14	1
1	Response	25	1
2	Pre-Mask	13	0
2	RedPrime	4	0
2	CircleTarget	9	0
2	Post-Mask	14	0
2	Response	20	0

This is the table with all the new possible combinations:

When target is..	And Accuracy is...	New Code #
7	1	15
7	0	16
8	1	17
8	0	18
9	1	19
9	0	20
10	1	21
10	0	22
11	1	23
11	0	24
12	1	25
12	0	26

I have tried different combinations of lag() and case_when(), but as I start typing, I realize I don't really know how to structure the argument (I'm still very new to R and programming in general). I know I could manually input the numbers in Excel, but I'm hoping to automate this to some extent in R. I realize this information could also just be put into a new row pretty easily, but in order for the next program I am using to understand the code values, they all will need to be in the same column in the order in which they occur in real time (the response input will always follow the post-visual mask, code 14).

If any clarification is needed please let me know. Any help is appreciated!

Gregor Thomas · Accepted Answer

I think this follows your logic:

library(dplyr)
library(tidyr)

input %>%
  ## join the new values to the original data on their current rows
  left_join(lookup, by = c("Code", "Accuracy")) %>%
  ## do all subsequent operations within each Trial group
  group_by(Trial) %>%
  ## Fill in missing `New_Code` values based on the previous non-missing value
  fill(New_Code, .direction = "down") %>%
  ## replace any `Code` missing values with the `New_Code` value
  mutate(Code = coalesce(Code, New_Code))

# # A tibble: 10 x 5
# # Groups:   Trial [2]
# Trial Label         Code Accuracy New_Code
#                  
#  1     1 Pre-Mask        13        1       NA
#  2     1 BluePrime        6        1       NA
#  3     1 SqureTarget     12        1       25
#  4     1 Post-Mask       14        1       25
#  5     1 Response        25        1       25
#  6     2 Pre-Mask        13        0       NA
#  7     2 RedPrime         4        0       NA
#  8     2 CircleTarget     9        0       20
#  9     2 Post-Mask       14        0       20
# 10     2 Response        20        0       20

You can, of course, drop the New_Code column by adding ... %>% select(-New_Code) -- I leave it in above for clarity.

Using this data:

input = read.table(text = 'Trial    Label   Code    Accuracy
1   Pre-Mask    13  1
1   BluePrime   6   1
1   SqureTarget 12  1
1   Post-Mask   14  1
1   Response    NA  1
2   Pre-Mask    13  0
2   RedPrime    4   0
2   CircleTarget    9   0
2   Post-Mask   14  0
2   Response    NA  0', header = T)

lookup = read.table(text = 'Code    Accuracy    New_Code
7   1   15
7   0   16
8   1   17
8   0   18
9   1   19
9   0   20
10  1   21
10  0   22
11  1   23
11  0   24
12  1   25
12  0   26', header = TRUE)

Using case_when and lag() together?

Answers (2)

Related Questions