Reputation: 14309
I have a data frame of the following form:
Year Value
===========
1999 A
2000 A
2001 B
2002 B
2003 A
2004 A
I need to get the following output:
"Min Year" "Max Year" Value
============================
1999 2000 A
2001 2002 B
2003 2004 A
Thus, I have grouping duplicates and using aggregate
out-of-the-box won't work:
> df
Year Value
1 1999 A
2 2000 A
3 2001 B
4 2002 B
5 2003 A
6 2004 A
> aggregate(Year ~ Value, df, min)
Value Year
1 A 1999
2 B 2001
> aggregate(Year ~ Value, df, max)
Value Year
1 A 2004
2 B 2002
What function would help accomplish this without using loops?
Upvotes: 0
Views: 75
Reputation: 92282
Or using rle
(only once!)
reps <- rle(as.character(df$Value))
temp <- data.frame(t(sapply(c(1, cumsum(reps$lengths)), function(x) range(df$Year[(x - 1) : x])))[-1, ], reps$values)
colnames(temp) <- c("Min Year", "Max Year", "Value")
# Min Year Max Year Value
# 1 1999 2000 A
# 2 2001 2002 B
# 3 2003 2004 A
Upvotes: 2
Reputation: 886938
Create an index
indx <- setNames(cumsum(c(T,df$Value[-1]!=df$Value[-length(df$Value)])),df$Value)
?aggregate on that index
res <- aggregate(Year~indx, df,FUN=function(x) c(Min=min(x), Max=max(x)))
res$indx <- names(res$indx)
res[,2:3] <- as.data.frame(res[,2])
change the colnames accordingly
colnames(res) <- c("Value", "Min Year", "Max Year")
res
# Value Min Year Max Year
# 1 A 1999 2000
# 2 B 2001 2002
# 3 A 2003 2004
Upvotes: 2