Reputation: 3
In the table below, how can I find the two highest values in each row, then add these values together?
I have a copy of the attached table in RStudio. Is there a line of code I can use to add the two highest numbers on each row, so that I can apply to this to a much bigger dataset?
Upvotes: 0
Views: 1641
Reputation: 225
If you prefer one-liner, take that one
df <- data.frame(Mon = c(12,15,42,43,56,73,23),
Tues = c(15,14,12,75,98,79,68),
Wed = c(13,42,35,64,35,95,56),
Thur = c(23,46,32,94,78,68,35),
Friday = c(25,23,64,35,27,54,32))
sol <- apply(df, 1, function(x) sum(max(x), max(x[-which(x == max(x))[1]])))
Note that there are slightly more efficient ways. Lmk if you wanna hear them
Upvotes: 1
Reputation: 1724
You can do a rowwise calculation where you sort the values in the specified columns and sum the highest two:
library(dplyr)
df <- data.frame(Mon = c(12,15,42,43,56,73,23),
Tues = c(15,14,12,75,98,79,68),
Wed = c(13,42,35,64,35,95,56),
Thur = c(23,46,32,94,78,68,35),
Friday = c(25,23,64,35,27,54,32))
df %>%
rowwise() %>%
mutate(two_max = sum(sort(c(Mon, Tues, Wed, Thur, Friday), decreasing = TRUE)[1:2])) %>%
ungroup()
If you don't want to specify the column names manually you can also select all numeric columns at once:
df %>%
rowwise() %>%
mutate(two_max = sum(sort(c_across(where(is.numeric)), decreasing = TRUE)[1:2])) %>%
ungroup()
Both strategies give the result:
# A tibble: 7 x 6
Mon Tues Wed Thur Friday two_max
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 12 15 13 23 25 48
2 15 14 42 46 23 88
3 42 12 35 32 64 106
4 43 75 64 94 35 169
5 56 98 35 78 27 176
6 73 79 95 68 54 174
7 23 68 56 35 32 124
Upvotes: 4
Reputation: 2301
Using apply
with mtcars as an example:
top2 <- x <- apply(mtcars, 1, function(x) sort(x, decreasing = TRUE)[1:2])
top2 <- matrix(top2, ncol = 2, byrow = TRUE)
addem <- rowSums(top2)
top2plus <- cbind(mtcars, addem)
head(top2plus, 5)
mpg cyl disp hp drat wt qsec vs am gear carb addem
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4 270
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4 270
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1 201
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1 368
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2 535
Upvotes: 0
Reputation: 1291
One way to do this would be to reshape the dataframe to have tidy data (i.e. the days as a variable, the values as another), and use the dplyr verbs to process the data into a summary:
df <- data.frame(ID = LETTERS[1:7],
Mon = c(12, 15, 42, 43, 56, 73, 23),
Tues = c(15, 14, 12, 75, 98, 79, 68),
Wed = c(13, 42, 35, 64, 35, 95, 56),
Thur = c(23, 46, 32, 94, 78, 68, 35),
Friday = c(25, 23, 64, 35, 27, 54, 32))
library(dplyr)
library(tidyr)
df %>%
pivot_longer(-ID) %>% # reshape
group_by(ID) %>% # change scope to each ID
slice_max(value, n = 2) %>% # keep two maximums
summarise(max2 = sum(value)) # sum them
#> # A tibble: 7 x 2
#> ID max2
#> <chr> <dbl>
#> 1 A 48
#> 2 B 88
#> 3 C 106
#> 4 D 169
#> 5 E 176
#> 6 F 174
#> 7 G 124
Created on 2020-12-01 by the reprex package (v0.3.0)
Upvotes: 0