Reputation: 25

How to find the largest range from a series of numbers using R?

I have a data set in which length and age are recorded for individual items (ID #). There are 4 different items, as you can see in the data set below.

range(dataset$length)

gives me the overall range of the length for all items. But I need to compare ranges to determine which item (ID #) has the largest range in length relative to the other 3.

length  age  ID #
3.5     5    1
7       10   1
10      15   1
4       5    2
8       10   2
13      15   2
3       5    3
7       10   3
9       15   3
4       5    4
5       10   4
7       15   4

Upvotes: 1

Answers (4)

IRTFM

Reputation: 263451

This gives you the width of the length range (max minus min) for each ID:

lapply( with(dat, tapply(length, ID, range)), diff)

And you can wrap which.max around that list to give you the ID associated with the largest value:

which.max( lapply( with(dat, tapply(length, ID, range)), diff) )
2 
2
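
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the question's data and returns a named numeric vector instead of a list. The data frame name `dat` and the column name `ID` are assumptions (the question's header reads "ID #"), and the anonymous function is just one way to get max minus min per group.

```r
# Rebuild the question's data; `dat` and the column name `ID` are assumed names.
dat <- data.frame(
  length = c(3.5, 7, 10, 4, 8, 13, 3, 7, 9, 4, 5, 7),
  age    = rep(c(5, 10, 15), times = 4),
  ID     = rep(1:4, each = 3)
)

# Width of the length range (max minus min) for each ID, named by ID.
widths <- with(dat, tapply(length, ID, function(x) diff(range(x))))
widths

# Label of the ID with the widest range.
names(widths)[which.max(widths)]
```

Returning a plain numeric vector means which.max does not have to coerce a list first.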

Upvotes: 2

spf614

Reputation: 52

An easy approach which doesn't use dplyr, though perhaps less elegant, is the which function.

range(dataset$length[which(dataset$ID == 1)])
range(dataset$length[which(dataset$ID == 2)])
range(dataset$length[which(dataset$ID == 3)])
range(dataset$length[which(dataset$ID == 4)])

You could also make a function that gives you the actual range (the difference between the max and the min) and use lapply to show you the IDs paired with their ranges.

largest_range <- function(id){
    # Lengths belonging to this ID
    x <- dataset$length[which(dataset$ID == id)]
    # Pair the ID with the width of its range (max minus min)
    rbind(id, max(x) - min(x))
}

lapply(X = unique(dataset$ID), FUN = largest_range)
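
If you would rather get the ID and range pairs back as one data frame than as a list of small matrices, base R's aggregate can do the grouping. This is only a sketch: the data frame name `dataset`, the column name `ID`, and the new column name `length_range` are assumptions.

```r
# The question's data, with the assumed column name `ID`.
dataset <- data.frame(
  length = c(3.5, 7, 10, 4, 8, 13, 3, 7, 9, 4, 5, 7),
  ID     = rep(1:4, each = 3)
)

# One row per ID; the second column holds max minus min of length.
spans <- aggregate(length ~ ID, data = dataset,
                   FUN = function(x) max(x) - min(x))
names(spans)[2] <- "length_range"

# Keep only the row with the widest range.
spans[which.max(spans$length_range), ]
```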

Upvotes: 0

Joshua Rosenberg

Reputation: 4226

group_by in dplyr may be helpful:

library(dplyr)

dataset %>%
    group_by(ID) %>%
    # ID_range is max(length) minus min(length) within each ID
    summarize(ID_range = diff(range(length)))

The above code is equivalent to the following, which is written without %>%:

library(dplyr)

dataset <- group_by(dataset, ID)
summarize(dataset, ID_range = diff(range(length)))
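
To go from a per-ID summary to the single ID with the widest range, one option is to add slice_max (available in dplyr 1.0.0 and later) to the end of the pipeline. A minimal sketch, assuming the data frame is called `dataset` and the ID column is named `ID`; `widest` is just an illustrative name:

```r
library(dplyr)

# The question's data, with the assumed column name `ID`.
dataset <- data.frame(
  length = c(3.5, 7, 10, 4, 8, 13, 3, 7, 9, 4, 5, 7),
  ID     = rep(1:4, each = 3)
)

widest <- dataset %>%
    group_by(ID) %>%
    # Width of the length range within each ID
    summarize(ID_range = max(length) - min(length)) %>%
    # Keep the row(s) with the largest width
    slice_max(ID_range, n = 1)

widest
```

Note that slice_max keeps ties by default, so more than one row can come back if two IDs share the largest range.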

Upvotes: 0

HubertL

Reputation: 19544

In base R:

mins <- tapply(df$length, df$ID, min)
maxs <- tapply(df$length, df$ID, max)
# tapply names its results by ID, so take the label of the largest difference
names(which.max(maxs - mins))
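
One thing to watch with tapply: its results come back in sorted ID order, while unique() returns IDs in order of first appearance. The two only line up when the data happen to be sorted by ID. A small sketch with hypothetical, deliberately unsorted data shows the difference:

```r
# Hypothetical data: ID 2 appears before ID 1.
df <- data.frame(
  length = c(4, 8, 13, 3.5, 7, 10),
  ID     = c(2, 2, 2, 1, 1, 1)
)

mins  <- tapply(df$length, df$ID, min)
maxs  <- tapply(df$length, df$ID, max)
spans <- maxs - mins

names(spans)                       # IDs in sorted order
unique(df$ID)                      # IDs in order of appearance
names(spans)[which.max(spans)]     # label taken from the tapply result itself
unique(df$ID)[which.max(spans)]    # position looked up in a differently ordered vector
```

Here the widest range belongs to ID 2, which is what the names() lookup returns, while indexing into unique(df$ID) picks ID 1.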

Upvotes: 1
