Reputation: 105
I have a data frame named data_frame like this
period X Y Z
1 10 20 12
2 15 14 17
3 20 35 29
4 22 45 25
5 27 16 28
6 12 20 25
Now I want to generate another df by having the maximum value of first two, second two, third period two where expected output is like this
period X Y Z
1 15 20 17
2 22 45 29
3 27 20 28
Upvotes: 0
Views: 40
Reputation: 76450
The following will do what you want.
df2 <- data_frame
df2$fact <- (data_frame$period + 1) %/% 2
df <- aggregate(. ~ fact, df2, max)
rm(df2)
df <- df[, -2]
names(df)[1] <- names(data_frame)[1]
df
# period X Y Z
#1 1 15 20 17
#2 2 22 45 29
#3 3 27 20 28
DATA
data_frame <-
structure(list(period = 1:6, X = c(10L, 15L, 20L, 22L, 27L, 12L
), Y = c(20L, 14L, 35L, 45L, 16L, 20L), Z = c(12L, 17L, 29L,
25L, 28L, 25L)), .Names = c("period", "X", "Y", "Z"), class = "data.frame", row.names = c(NA,
-6L))
EDIT.
If you want periods of length 3 adjust fact
accordingly.
df2$fact <- (data_frame$period + 2) %/% 3
More generally, for periods of length n
,
df2$fact <- (data_frame$period + n - 1) %/% n
Upvotes: 1
Reputation: 2496
with tidyverse
, you can use something like:
data_frame %>%
mutate(k = period) %>%
mutate(k = if_else( (mod(k,2)==0), lag(k,1), k)) %>%
group_by(k) %>%
summarise_all(max)
and then do a select
to keep only the required columns.
Upvotes: 0
Reputation: 1257
You can use dplyr
and an additional index column to group by
and find the maximum:
d <- data_frame(X = sample(1:20, 6, FALSE),
Y = sample(1:20, 6, FALSE),
Z = sample(1:20, 6, FALSE))
d
# A tibble: 6 x 4
X Y Z index
<int> <int> <int> <dbl>
1 15 12 12 1
2 19 5 18 1
3 14 6 20 2
4 20 8 11 2
5 6 13 16 3
6 1 17 9 3
library(dplyr)
d %>%
mutate(index = c(1,1,2,2,3,3)) %>%
group_by(index) %>%
summarise_all(max)
# A tibble: 3 x 4
index X Y Z
<dbl> <dbl> <dbl> <dbl>
1 1 19 12 18
2 2 20 8 20
3 3 6 17 16
Upvotes: 1