xiaodai

Reputation: 16004

R: How to assign a running counter to each unique value in a vector?

I ran into this problem while trying to find the date of the second Sunday of each month for the next 100 years. This is my code:

x <- seq(as.Date("2014-9-01"),as.Date("2014-9-01")+100*365.25,1)

y <- format(x,"%Y%m")

xx <- NULL
for(i in unique(y)) {
  w <- which(y == i)
  xx <- c(xx,x[w[which(weekdays(x[w]) == "Sunday")[2]]])
}

head(xx)
tail(xx)

I have achieved it but I had to use a loop. How do I do this more efficiently with vectorised code?

In general, suppose there is a vector v with n distinct values. How do I number the occurrences of each distinct value, starting at 1 for each distinct value? That is, suppose I start with a vector

v <- c(1,1,1,2,2,2,2,3,4,4)

and I want to generate a "running counter", v.counter, of the unique values in v:

v.counter <- c(1,2,3,1,2,3,4,1,1,2)

Obviously I can write a loop to do this, but how do I do it with vectorised code instead?

Upvotes: 2

Views: 1785

Answers (5)

MrFlick

Reputation: 206187

This should be fairly simple using the ave() function for generating group-specific values.

ave(v, v, FUN=seq_along)
# [1] 1 2 3 1 2 3 4 1 1 2

Should you want to look only at consecutive runs rather than unique values in v, you could do something like this as well:

v <- c(1,1,1,2,2,2,2,1,2,2)
ave(v, with(rle(v), rep(1:length(lengths), lengths)), FUN=seq_along)
# [1] 1 2 3 1 2 3 4 1 1 2

which gives the same values even though v now contains only two distinct values. The first solution would have continued counting where the 1's left off when they were encountered the second time. Also, if v isn't numeric, you can do

v <- rep(letters[1:4], c(3,4,1,2))
ave(seq_along(v), v, FUN=seq_along)
# [1] 1 2 3 1 2 3 4 1 1 2

to still get numeric values.
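
To make the difference between the two counters concrete, here is a minimal sketch (not part of the original answer). It runs both on the same vector and also shows `sequence(rle(v)$lengths)`, a base-R shortcut I am swapping in for the run-based `ave()` call; the names `by_value` and `by_run` are illustrative only.

```r
v <- c(1, 1, 1, 2, 2, 2, 2, 1, 2, 2)

# Counter per distinct value: the later 1 and 2s continue their earlier counts.
by_value <- ave(v, v, FUN = seq_along)

# Counter per consecutive run: restarts whenever the value changes.
# rle() gives the run lengths; sequence() expands each length n into 1..n.
by_run <- sequence(rle(v)$lengths)

print(by_value)  # 1 2 3 1 2 3 4 4 5 6
print(by_run)    # 1 2 3 1 2 3 4 1 1 2
```

Which one is wanted depends on whether a value that reappears later should resume its count or start over.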

Upvotes: 4

jazzurro

Reputation: 23574

There are many good answers already. I will leave the following, which gets the second Sunday of each month for the next 100 years. I am sure there are better ways of handling Date objects, but this works too.

library(lubridate)
library(dplyr)
library(tidyr)

x <- seq(as.Date("2014-9-01"),as.Date("2014-9-01")+100*365.25,1)
weekday <- wday(x)
foo <- data.frame(x, weekday, stringsAsFactors = FALSE)


ana <- foo %>%
    separate(x, c("year", "month", "date"), sep = "-") %>%
    filter(weekday == 1) %>%
    group_by(year, month) %>%
    filter(row_number() == 2) %>%
    unite(sunday, year, month, date, sep = "-") %>%
    mutate(sunday = as.Date(sunday)) %>% ### If you want date object
    select(sunday) ### If you want just one column

head(ana)
Source: local data frame [6 x 1]
      sunday
1 2014-09-14
2 2014-10-12
3 2014-11-09
4 2014-12-14
5 2015-01-11
6 2015-02-08
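
As one possible alternative for the date handling, here is a sketch in base R (not the approach used in this answer). It relies on the fact that the second Sunday of a month is the only Sunday falling on days 8 to 14. The one-year range and the names `lt` and `second_sundays` are assumptions chosen to keep the example small.

```r
# One year of daily dates, starting where the question starts.
x <- seq(as.Date("2014-09-01"), by = "day", length.out = 365)

# POSIXlt exposes $wday (0 = Sunday) and $mday (day of the month).
lt <- as.POSIXlt(x)

# The first Sunday lies in days 1-7, so the second lies in days 8-14.
second_sundays <- x[lt$wday == 0 & lt$mday >= 8 & lt$mday <= 14]

head(second_sundays)
```

No grouping or counter is needed here, because the day-of-month window already identifies the second occurrence.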

Upvotes: 2

Tom Martens

Reputation: 776

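Just for the sake of completeness, I want to add the data.table solution, using x and y as defined in the question:

library(data.table)

dt <- data.table(x, y)
dt[, wd := weekdays(x)]
dt <- dt[, wdidx := seq_along(.I), by = c("y", "wd")][wd == "Sonntag" & wdidx == 2, ]
head(dt, 20)

"Sonntag" is German for Sunday. weekdays() returns the day name in the current locale, so the string to compare against depends on your locale settings (for example "Sunday" in an English locale).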

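Since weekdays() depends on the locale, a locale-independent variant may be useful. The following is a minimal sketch in base R rather than data.table (so it is not the code above), assuming the numeric `wday` field of `as.POSIXlt()`, where 0 means Sunday. It keeps the same idea of a counter per month and weekday; the one-year range is an assumption to keep it short.

```r
x <- seq(as.Date("2014-09-01"), by = "day", length.out = 365)
y <- format(x, "%Y%m")            # year-month key, as in the question

wd <- as.POSIXlt(x)$wday          # 0 = Sunday, ..., 6 = Saturday

# Running counter within each (month, weekday) group.
wdidx <- ave(seq_along(x), y, wd, FUN = seq_along)

# Keep the second Sunday of each month.
second_sundays <- x[wd == 0 & wdidx == 2]
head(second_sundays)
```

Comparing against the number 0 instead of a day name means the same code should behave identically under a German or an English locale.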
Upvotes: 1

Alex

Reputation: 15708

Suppose we have a data frame containing v:

data <- data.frame(v = c(1,1,1,2,2,2,2,3,4,4))

Then, using dplyr:

library(dplyr)
data %>%
    group_by(v) %>%
    mutate(v.counter = row_number())
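
For readers without dplyr installed, roughly the same column can be sketched in base R with ave(). This is an illustrative equivalent, not part of the original answer, and it should reproduce the v.counter vector from the question.

```r
data <- data.frame(v = c(1, 1, 1, 2, 2, 2, 2, 3, 4, 4))

# Number the rows within each group of identical v values.
data$v.counter <- ave(seq_along(data$v), data$v, FUN = seq_along)

print(data$v.counter)  # 1 2 3 1 2 3 4 1 1 2
```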

Upvotes: 3

eipi10

Reputation: 93761

You can do the running count with dplyr:

library(dplyr)

dat = data.frame(x=rep(1:10, each=3))

dat = dat %>%
  group_by(x) %>%
  mutate(x_count=1:n())

    x x_count
1   1       1
2   1       2
3   1       3
4   2       1
5   2       2
6   2       3
...
25  9       1
26  9       2
27  9       3
28 10       1
29 10       2
30 10       3

Upvotes: 8
