Reputation: 81
I would like to know how to replace observations in a series of matrices, where the replacement rule differs by column, in as automated a way as possible.
I have a large number of matrices to process (~240). Processing requires replacing observations above a certain value with 0. Each matrix has identical dimensions and contains two types of columns, each with its own cut-off above which observations should be replaced with 0. The cut-offs also differ from matrix to matrix. I have figured out a way to process the data, but it is very slow and manual. Is there a way to automate this, either through a function or a for loop?
For instance, take the matrix x, where observations in columns 3, 4 and 9 need to become 0 if greater than 45 and observations in all other columns need to become 0 if greater than 70.
I have thought of two ways to replace the data, both detailed below:
cols <- c(3,4,9)
x <- matrix(sample(1:100),10,10)
#attempt 1
x[x>70] <- 0 # replace all values > 70 with 0 (true for all columns)
x.sub <- x[,cols] # subset columns with lower limit
x.sub[x.sub>45] <- 0 # replace all values > 45 with 0 for the subset (rule 2 columns)
x[,cols] <- x.sub # add revised values back to main object
#attempt 2
x.sub1 <- x[,-cols] # subset columns that follow rule 1
x.sub1[x.sub1>70] <- 0 # replace all values over 70 with 0
x.sub2 <- x[,cols] # subset columns that follow rule 2
x.sub2[x.sub2>45] <- 0 # replace all values over 45 with 0
x[,-cols] <- x.sub1 # add revised values back to main object (rule 1 columns)
x[,cols] <- x.sub2 # add revised values back to main object (rule 2 columns)
Any help would be greatly appreciated!!
Upvotes: 1
Views: 563
Reputation: 886948
We could also do this with negative indexing:
x[, -cols] <- (x[, -cols] <=70) * x[, -cols]
x[, cols] <- (x[, cols] <= 45) * x[, cols]
x
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,]    0   27    0    0   35   20   24   33    0    38
# [2,]    0   36    0   12   19   17   40   44   18    52
# [3,]    0   10    0   15   60   43    0    0    0     0
# [4,]    2    0   32    0    9   63    0   48   16    69
# [5,]    0    0    5   30    0   47   57    0   45     0
# [6,]   39   61    0    0    0    0   37   56   23    46
# [7,]    0   51    0   42    0    3   28    7   26    11
# [8,]    6    0    0   25   65   13    0   49    0     4
# [9,]    0    0    0    0    0    8   31    0   14    34
#[10,]    0    0    0   29   22    0    1   66   21    41
cols <- c(3,4,9)
set.seed(69)
x <- matrix(sample(1:100), 10, 10)
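To run this over the whole set of matrices, one option is to wrap the two assignments in a function and map it over a list. This is only a sketch: the helper name `zero_above`, the list `mats`, and the cut-off vectors `low` and `high` are made up for illustration, and it assumes the matrices are already in a list and that the same `cols` applies to every matrix.

```r
# Hypothetical helper: zero every value above a column-specific cut-off.
# Columns in `cols` use `low`; all other columns use `high`.
zero_above <- function(m, cols, low, high) {
  m[, -cols] <- (m[, -cols] <= high) * m[, -cols]
  m[, cols]  <- (m[, cols]  <= low)  * m[, cols]
  m
}

cols <- c(3, 4, 9)
set.seed(69)

# Stand-in for the real data: a list of three 10 x 10 matrices
mats <- replicate(3, matrix(sample(1:100), 10, 10), simplify = FALSE)

# One pair of cut-offs per matrix (illustrative values only)
low  <- c(45, 40, 50)
high <- c(70, 65, 80)

# Apply the helper to each matrix with its own cut-offs
out <- mapply(zero_above, mats, low = low, high = high,
              MoreArgs = list(cols = cols), SIMPLIFY = FALSE)
```

`out` is a list of the same length as `mats`. If the cut-offs are identical for every matrix, `lapply(mats, zero_above, cols = cols, low = 45, high = 70)` would do the same job with less setup.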
Upvotes: 1
Reputation: 173793
If we make a reproducible matrix:
cols <- c(3,4,9)
set.seed(69)
x <- matrix(sample(1:100), 10, 10)
x
#>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#>  [1,]   96   27   74   53   35   20   24   33   50    38
#>  [2,]   81   36   54   12   19   17   40   44   18    52
#>  [3,]   91   10   55   15   60   43   88   71   58    82
#>  [4,]    2   92   32   70    9   63   87   48   16    69
#>  [5,]   77   78    5   30   79   47   57   72   45    89
#>  [6,]   39   61   85   75   80   93   37   56   23    46
#>  [7,]   95   51   67   42   99    3   28    7   26    11
#>  [8,]    6   76   62   25   65   13   86   49   59     4
#>  [9,]   97  100   68   83   73    8   31   94   14    34
#> [10,]   90   84   64   29   22   98    1   66   21    41
We can get the column index of every entry using a little integer division, and from that flag the entries that sit in the target columns:
incols <- ((seq_along(x) - 1) %/% nrow(x) + 1) %in% cols
Which allows straightforward subsetting:
x[x > 70 & !incols] <- 0
x[x > 45 & incols] <- 0
x
#>       [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#>  [1,]    0   27    0    0   35   20   24   33    0    38
#>  [2,]    0   36    0   12   19   17   40   44   18    52
#>  [3,]    0   10    0   15   60   43    0    0    0     0
#>  [4,]    2    0   32    0    9   63    0   48   16    69
#>  [5,]    0    0    5   30    0   47   57    0   45     0
#>  [6,]   39   61    0    0    0    0   37   56   23    46
#>  [7,]    0   51    0   42    0    3   28    7   26    11
#>  [8,]    6    0    0   25   65   13    0   49    0     4
#>  [9,]    0    0    0    0    0    8   31    0   14    34
#> [10,]    0    0    0   29   22    0    1   66   21    41
Created on 2020-07-03 by the reprex package (v0.3.0)
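As a possible simplification, base R's `col()` returns the column index of every entry directly, so it should flag the same entries as the index arithmetic. The sketch below checks that and then goes one step further with a per-column cut-off vector; the names `cutoff` and `same_index` are introduced here for illustration, and the 45/70 cut-offs are the ones stated in the question.

```r
cols <- c(3, 4, 9)
set.seed(69)
x <- matrix(sample(1:100), 10, 10)

# Column membership via integer division and via col()
incols <- ((seq_along(x) - 1) %/% nrow(x) + 1) %in% cols
same_index <- all(incols == (col(x) %in% cols))

# One cut-off per column: 45 for the target columns, 70 elsewhere
cutoff <- rep(70, ncol(x))
cutoff[cols] <- 45

# cutoff[col(x)] expands to one threshold per entry (column-major order),
# so a single comparison covers both rules
x[x > cutoff[col(x)]] <- 0
```

Because the rule lives in a plain vector with one entry per column, the same line would also cover more than two types of column, or a different `cutoff` vector per matrix.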
Upvotes: 1