Mark Miller

Reputation: 13103

locate dates of evenly-spaced events

I wish to locate the date of evenly-spaced events when given the number of events and the number of days in the period of interest. This seems like a trivial objective, but it is confusing me.

Here is a very simple example that has a straightforward solution:

n.trips <-  5
n.days  <- 20

mean.trips.per.day <- n.trips / n.days

cummulative.trips <- mean.trips.per.day * c(1:n.days)
cummulative.trips
#[1] 0.25 0.50 0.75 1.00 1.25 1.50 1.75 2.00
#    2.25 2.50 2.75 3.00 3.25 3.50 3.75 4.00 4.25 4.50 4.75 5.00

# Find the date of each trip
which(cummulative.trips %in% c(1:n.days))
#[1]  4  8 12 16 20

But the following example is not straightforward. I show three attempted solutions, but none matches the desired result. In this example I am trying to pick out the positions of the six elements of the vector cummulative.trips that most closely match the integers 1:6. Those positions are shown in the vector desired.dates:

n.trips <-  6
n.days  <- 17

# Here are the desired results
date.of.first.trip   <-  3  # 1.0588235
date.of.second.trip  <-  6  # 2.1176471
date.of.third.trip   <-  8  # or 9 (2.8235294 and 3.1764706 are equally close to 3); 8 is the first
date.of.fourth.trip  <- 11  # 3.8823529
date.of.fifth.trip   <- 14  # 4.9411765
date.of.sixth.trip   <- 17  # 6.0000000
desired.dates <- c(3,6,8,11,14,17)

mean.trips.per.day <- n.trips / n.days

cummulative.trips <- mean.trips.per.day * c(1:n.days)
cummulative.trips
#[1] 0.3529412 0.7058824 1.0588235 1.4117647 1.7647059
#    2.1176471 2.4705882 2.8235294 3.1764706 3.5294118
#    3.8823529 4.2352941 4.5882353 4.9411765 5.2941176 5.6470588 6.0000000

Here are three possible solutions I attempted:

# Find the date of each trip
which(cummulative.trips %in% c(1:n.days))
#[1] 17

which(round(cummulative.trips) %in% c(1:n.days))
#[1]  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17

round(seq(1, n.days, length.out = n.trips))
#[1]  1  4  7 11 14 17
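
The second attempt returns nearly every day because rounding maps several consecutive days to the same integer, so %in% cannot single out one day per trip. A small sketch that makes this visible (by.rounded is just an illustrative name, not part of the code above):

```r
n.trips <- 6
n.days  <- 17
cummulative.trips <- (n.trips / n.days) * seq_len(n.days)

# Group the day numbers by the integer their cumulative trip count rounds to.
# by.rounded is an illustrative name introduced for this sketch.
by.rounded <- split(seq_len(n.days), round(cummulative.trips))

# Days 8 and 9 both round to 3, so rounding alone cannot choose between them.
by.rounded[["3"]]
```

Picking one day per group still requires a distance comparison against the target integer.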

EDIT

I tried the function below, which MrFlick suggested in a comment (it comes from the question "What is the fastest way to check if a number is a positive natural number? (in R)"). For my second example it returns essentially the same result as the first of the three approaches above: only day 17 is flagged.

is.naturalnumber <-
function(x, tol = .Machine$double.eps^0.5)  x > tol & abs(x - round(x)) < tol

x <- cummulative.trips
is.naturalnumber(x)
#[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE

Upvotes: 0

Views: 113

Answers (2)

Mark Miller

Reputation: 13103

After checking @MrFlick's answer with a number of combinations of n.trips and n.days I discovered a scenario where his code did not return the answer I expected (n.trips <- 26; n.days <- 13). His code returned, assuming I used it correctly:

[1]  1  1  1  2  2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10 11 11 12 12 13

But I was expecting:

[1]  1  1  2  2  3  3  4  4  5  5  6  6  7  7  8  8  9  9 10 10 11 11 12 12 13 13

I probably should have explained my problem more clearly in my original post. I ended up writing the following for-loop and have tested it with the 10 combinations of n.trips and n.days listed below. So far the loop returns what I expect for all 10 combinations. The code does incorporate @MrFlick's approach, albeit in substantially modified form.

mean.trips.per.day <- n.trips / n.days
mean.trips.per.day

cummulative.trips.by.day <- mean.trips.per.day * c(1:n.days)
cummulative.trips.by.day

date.of.trip <- rep(0, n.trips)

for(i in 1:n.trips) {

     # days whose rounded cumulative trip count has reached at least i
     trip.candidate.days <- which(round(cummulative.trips.by.day) >= i)

     if(length(trip.candidate.days) > 0) {
          # among the candidates, pick the day whose cumulative count is closest to i
          nearest <- which.min(abs(cummulative.trips.by.day[trip.candidate.days] - i))
          date.of.trip[i] <- trip.candidate.days[nearest]
     } else {
          # no dates have a value that rounds to >= i, which suggests there were at most i-1 trips
          date.of.trip[i] <- 0
     }

}

cummulative.trips.by.day

date.of.trip

Here are the 10 combinations of n.trips and n.days I have used so far to test this code.

n.trips <- 12
n.days  <- 12

n.trips <-  6
n.days  <- 12

n.trips <-  5
n.days  <- 13

n.trips <- 26
n.days  <- 13

n.trips <- 28
n.days  <- 13

n.trips <- 20
n.days  <- 13

n.trips <-  0
n.days  <- 13

n.trips <-  1
n.days  <- 13

n.trips <-  2
n.days  <- 13

n.trips <- 100
n.days  <-  23
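
To rerun all ten combinations in one go, the loop can be wrapped in a function. This is only a sketch: find_trip_dates, combos and results are names introduced here, and it uses seq_len(n.trips) rather than 1:n.trips so that n.trips = 0 produces an empty result instead of iterating over c(1, 0).

```r
# Hypothetical wrapper around the for-loop above (the function name is an assumption).
find_trip_dates <- function(n.trips, n.days) {
    cumulative <- (n.trips / n.days) * seq_len(n.days)
    date.of.trip <- rep(0, n.trips)   # 0 marks a trip with no candidate day
    for (i in seq_len(n.trips)) {
        # days whose rounded cumulative count has reached at least i
        candidates <- which(round(cumulative) >= i)
        if (length(candidates) > 0) {
            nearest <- which.min(abs(cumulative[candidates] - i))
            date.of.trip[i] <- candidates[nearest]
        }
    }
    date.of.trip
}

# The ten combinations listed above.
combos <- data.frame(
    n.trips = c(12,  6,  5, 26, 28, 20,  0,  1,  2, 100),
    n.days  = c(12, 12, 13, 13, 13, 13, 13, 13, 13,  23)
)

# One result vector per combination, in the same order as the rows of combos.
results <- Map(find_trip_dates, combos$n.trips, combos$n.days)
```

Each element of results can then be compared against the dates expected for that combination.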

Upvotes: 0

MrFlick

Reputation: 206187

Perhaps something like this will work:

nearest_index <- function(targets, values) {
    sapply(targets, function(x) which.min(abs(values-x)))
}
nearest_index(1:6, cummulative.trips)
# [1]  3  6  8 11 14 17

For each "target" value, which.min(abs(values-x)) returns the index of the observed value closest to that target. If several values are exactly equally close, which.min returns the first of them.
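
Because cummulative.trips is computed in floating point, two days that are mathematically equidistant from a target (days 8 and 9 for the third trip) may not compare as exactly equal. A minimal sketch of an integer-arithmetic variant, assuming n.trips and n.days are whole numbers; nearest_day_exact is a hypothetical name, not part of the answer above:

```r
# Hypothetical helper: same nearest-value idea, but without the division.
# |d * n.trips / n.days - i| and |d * n.trips - i * n.days| rank days identically,
# and the second form is exact for whole-number inputs.
nearest_day_exact <- function(n.trips, n.days) {
    days <- seq_len(n.days)
    vapply(seq_len(n.trips), function(i) {
        # which.min returns the first minimum, so exact ties go to the earlier day
        which.min(abs(days * n.trips - i * n.days))
    }, integer(1))
}

nearest_day_exact(6, 17)
# [1]  3  6  8 11 14 17
```

With this form the tie between days 8 and 9 is exact, so the earlier day is always chosen.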

Upvotes: 1
