Reputation:
Having this data:
dates
1990-02-02 01:00:00
1990-02-02 02:00:00
1990-02-03 01:00:00
1990-02-03 02:00:00
1990-02-04 01:00:00
1990-02-04 02:00:00
Could I create a column, in which every row has the number of day like this?
number dates
1 1990-02-02 01:00:00
1 1990-02-02 02:00:00
2 1990-02-03 01:00:00
2 1990-02-03 02:00:00
3 1990-02-04 01:00:00
3 1990-02-04 02:00:00
Upvotes: 1
Views: 96
Reputation: 193517
This solution assumes that the dates are listed in chronological order (or are at least sorted by day) and that the dates are actually formatted as date time objects.
temp <- data.frame(dates = c('1990-02-02 01:00:00',
'1990-02-02 02:00:00',
'1990-02-03 01:00:00',
'1990-02-03 02:00:00',
'1990-02-04 01:00:00',
'1990-02-04 02:00:00',
'1990-02-04 03:00:00'))
temp$dates <- as.POSIXct(temp$dates, tz = "GMT")
x <- table(as.Date(temp$dates))
temp$number <- rep(seq_along(x), x)
temp
# dates number
# 1 1990-02-02 01:00:00 1
# 2 1990-02-02 02:00:00 1
# 3 1990-02-03 01:00:00 2
# 4 1990-02-03 02:00:00 2
# 5 1990-02-04 01:00:00 3
# 6 1990-02-04 02:00:00 3
# 7 1990-02-04 03:00:00 3
The basic idea is to just strip the time out (using as.Date
) and tabulate the frequency of each day. You can then use rep
on that output to create your "number" variable.
Slap forehead and post a simpler solution
Just use factor
on as.Date
. This would work even on a data.frame
where the data are not ordered:
temp <- data.frame(dates = c('1990-02-02 01:00:00',
'1990-02-02 02:00:00',
'1990-02-03 01:00:00',
'1990-02-03 02:00:00',
'1990-02-04 01:00:00',
'1990-02-04 02:00:00',
'1990-02-04 03:00:00'))
temp$dates <- as.POSIXct(temp$dates, tz = "GMT")
within(temp, {
counts <- as.numeric(factor(as.Date(dates)))
})
# dates counts
# 1 1990-02-02 01:00:00 1
# 2 1990-02-02 02:00:00 1
# 3 1990-02-03 01:00:00 2
# 4 1990-02-03 02:00:00 2
# 5 1990-02-04 01:00:00 3
# 6 1990-02-04 02:00:00 3
# 7 1990-02-04 03:00:00 3
Upvotes: 2
Reputation: 6146
The rank function should do what you need but consult the documentation for it as there's not an easily reproducible exampe for me to play with
temp <- data.frame(dates = c('1990-02-02 01:00:00',
'1990-02-02 02:00:00',
'1990-02-03 01:00:00',
'1990-02-03 02:00:00',
'1990-02-04 01:00:00',
'1990-02-04 02:00:00',
'1990-02-04 03:00:00'))
temp$datesonly<-as.Date(temp$dates)
temp2<-data.frame(dates=unique(temp$datesonly),ranks=rank(unique(temp$datesonly),ties.method="first"))
temp<-merge(temp,temp2,by.x="datesonly",by.y="dates")
Upvotes: 2
Reputation: 19
Tries with this...
- library(lubridate)
date <- c("1990-02-02 01:00:00",
"1990-02-02 02:00:00",
"1990-02-03 01:00:00",
"1990-02-03 02:00:00",
"1990-02-04 01:00:00",
"1990-02-04 02:00:00")
number <- day(date)
cbind.data.frame(number,date)
Upvotes: 0