amisos55

Reputation: 1979

Recoding a numerical variable based on a specific criterion in R

I would like to recode a numerical variable based on a set of cut scores. If a cut score is not present in the variable, I would like to recode the closest smaller value as that cut score. Here is a snapshot of the dataset:

ids <- c(1,2,3,4,5,6,7,8,9,10)
scores <- c(512,531,541,555,562,565,570,572,573,588)
data <- data.frame(ids, scores)
> data
   ids scores
1    1    512
2    2    531
3    3    541
4    4    555
5    5    562
6    6    565
7    7    570
8    8    572
9    9    573
10  10    588

cuts <- c(531, 560, 575)

The first cut score (531) is in the dataset, so it stays 531. However, 560 and 575 are not present. For the second cut score, I would like to recode the closest smaller value (555) as 560 in a new column, and for the third cut score, I'd like to recode 573 as 575.

Here is what I would like to get.

   ids scores rescored
1    1    512      512
2    2    531      531
3    3    541      541
4    4    555      560
5    5    562      562
6    6    565      565
7    7    570      570
8    8    572      572
9    9    573      575
10  10    588      588

Any thoughts? Thanks

Upvotes: 1

Views: 86

Answers (1)

akrun

Reputation: 887048

One option is to find the index of each cut with findInterval, take the pmax of the 'scores' at those indices and the 'cuts', and update the 'rescored' column at those indices. Note that findInterval requires 'scores' to be sorted in non-decreasing order, as it is here.

i1 <- with(data, findInterval(cuts, scores))
data$rescored <- data$scores
data$rescored[i1] <- with(data, pmax(scores[i1], cuts))
data
#   ids scores rescored
#1    1    512      512
#2    2    531      531
#3    3    541      541
#4    4    555      560
#5    5    562      562
#6    6    565      565
#7    7    570      570
#8    8    572      572
#9    9    573      575
#10  10    588      588
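
If 'scores' might be unsorted, or a cut might be smaller than every score (in which case findInterval returns 0 for that cut), a more defensive variant may help. The following is only a sketch under those assumptions, not part of the approach above: `recode_to_cuts` is a hypothetical helper name, and it assumes that every row tied at the closest smaller value should be recoded.

```r
# Hypothetical helper (name and behaviour are illustrative assumptions).
# For each cut, find the largest score that does not exceed it and
# replace that score with the cut. Works on unsorted scores.
recode_to_cuts <- function(scores, cuts) {
  rescored <- scores
  for (cut in cuts) {
    below <- scores <= cut           # scores at or below this cut
    if (!any(below)) next            # no smaller value exists: skip this cut
    closest <- max(scores[below])    # closest value not exceeding the cut
    rescored[scores == closest] <- cut  # assumption: recode all ties
  }
  rescored
}

ids <- 1:10
scores <- c(512, 531, 541, 555, 562, 565, 570, 572, 573, 588)
data <- data.frame(ids, scores)
data$rescored <- recode_to_cuts(data$scores, c(531, 560, 575))
data
```

On the example data this gives the same 'rescored' column as above. The loop is slower than findInterval on long vectors, so it is mainly worth considering when the input is not guaranteed to be sorted.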

Upvotes: 2
