Reputation: 105
I have a dataframe with test values of 20 different teststeps from some units (e.g. unit 1..5) In addition I have 2 columns with lower and upper limit (ll and ul)
data <- data.frame(
ll=round(runif(20,5,15)),
ul=round(runif(20,80,95)),
unit1=sample(1:100,20,TRUE),
unit2=sample(1:100,20,TRUE),
unit3=sample(1:100,20,TRUE),
unit4=sample(1:100,20,TRUE),
unit5=sample(1:100,20,TRUE)
)
How do I get a count of values for each row which are below the lower limit (ll) and what count is above the upper limit (ul)?
Upvotes: 2
Views: 1306
Reputation: 193547
I would use within()
for this, following the same logic as @Jilber's answer:
# First, use set.seed to make your example reproducible
set.seed(1)
data <- data.frame(
ll=round(runif(5, 5, 15)),
ul=round(runif(5, 80, 95)),
unit1=sample(1:100, 5, TRUE),
unit2=sample(1:100, 5, TRUE),
unit3=sample(1:100, 5, TRUE),
unit4=sample(1:100, 5, TRUE),
unit5=sample(1:100, 5, TRUE)
)
data
# ll ul unit1 unit2 unit3 unit4 unit5
# 1 8 93 21 50 94 39 49
# 2 9 94 18 72 22 2 60
# 3 11 90 69 100 66 39 50
# 4 14 89 39 39 13 87 19
# 5 7 81 77 78 27 35 83
The within
function lets you add new columns in a convenient way.
within(data, {
below = rowSums(data[-c(1:2)] < ll)
above = rowSums(data[-c(1:2)] > ul)
})
# ll ul unit1 unit2 unit3 unit4 unit5 above below
# 1 8 93 21 50 94 39 49 1 0
# 2 9 94 18 72 22 2 60 0 1
# 3 11 90 69 100 66 39 50 1 0
# 4 14 89 39 39 13 87 19 0 1
# 5 7 81 77 78 27 35 83 1 0
Alternatively, you can also use transform()
to achieve the same output:
transform(data,
below = rowSums(data[-c(1:2)] < ll),
above = rowSums(data[-c(1:2)] > ul))
Benchmarking the Jilber's solution and these two on a 2,000,000 row dataset, here are the results:
test replications elapsed relative user.self sys.self
3 jilber 3 33.586 1.000 31.490 1.916
1 within 3 34.493 1.027 32.542 1.584
2 transform 3 33.813 1.007 31.870 1.828
I think these two functions hold up pretty well for the convenience they offer!
Upvotes: 1
Reputation: 61174
Try this:
below <- rowSums(data[,-c(1,2)] < data[,'ll']) # below
above <- rowSums(data[,-c(1,2)] > data[,'ul']) # above
DF <- cbind(data, below, above) # altogether
head(DF)
ll ul unit1 unit2 unit3 unit4 unit5 below above
1 10 82 75 10 24 80 60 0 0
2 13 87 60 20 4 94 31 1 1
3 8 85 58 9 98 94 65 0 2
4 11 95 68 45 5 38 76 1 0
5 11 87 15 79 5 43 67 1 0
6 11 89 33 6 18 1 22 2 0
Upvotes: 1