Marinka

Reputation: 1269

How to delete certain dates in a data frame in R

I have a quite large dataset (1,295,897 rows) of the water level in the North Sea. It is a very nice dataset, but for the years 1978-1987 the water level was measured every hour, and from 1988 onward it was measured every 10 minutes. I do not need the 10-minute measurements, so I would like to remove every measurement except the ones taken exactly on the hour (e.g. 10:00, 1:00).

This is what my data looks like from 1978 to 1987:

  posix                  waarde
1 1978-01-01 00:00:00     66
2 1978-01-01 01:00:00     51
3 1978-01-01 02:00:00     17
4 1978-01-01 03:00:00    -17
5 1978-01-01 04:00:00    -46
6 1978-01-01 05:00:00    -69

And this is what my dataset looks like from 1988 until 2010:

        posix               waarde
1295892 2010-12-31 23:00:00    -73
1295893 2010-12-31 23:10:00    -71
1295894 2010-12-31 23:20:00    -68
1295895 2010-12-31 23:30:00    -64
1295896 2010-12-31 23:40:00    -59
1295897 2010-12-31 23:50:00    -53

I hope that you can help me.

Upvotes: 2

Views: 855

Answers (1)

Ari B. Friedman

Reputation: 72731

A reproducible example would help. But if your variable is actually of a POSIX date-time class (POSIXct or POSIXlt), then:

library(lubridate)
dat[ minute(dat$posix)==0, ]
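
Since the question has no reproducible example, here is a minimal sketch of the same filter on toy data. The data frame below is made up for illustration (only the column names `posix` and `waarde` come from the question), and it assumes `posix` is stored as POSIXct:

```r
library(lubridate)

# Hypothetical toy data: two hours of 10-minute readings (values are invented)
dat <- data.frame(
  posix  = seq(as.POSIXct("2010-12-31 22:00:00", tz = "GMT"),
               by = "10 min", length.out = 12),
  waarde = seq_len(12)
)

# Keep only the rows whose timestamp falls on the full hour
hourly <- dat[minute(dat$posix) == 0, ]
hourly
```

Only the 22:00 and 23:00 rows should remain. Rows that were already hourly (1978-1987) pass the same test, so the filter can be applied to the whole dataset at once.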

The beauty of lubridate is that it handles the details for you:

> test <- as.POSIXlt(Sys.time(), "GMT")
> test
[1] "2013-09-26 17:50:16 GMT"
> minute(test)
[1] 50

If you need to rule out things not ending exactly on the hour to the second:

dat[ minute(dat$posix)==0 & second(dat$posix)==0, ]

You may want to round the seconds, since fractional seconds are reported as well:

> second(test)
[1] 16.54902
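
One way to handle the fractional seconds, sketched under the assumption that any timestamp within the first second of the hour should count as "on the hour", is to drop the fractional part with `floor()` before comparing. The timestamps here are invented for illustration:

```r
library(lubridate)

# Hypothetical timestamps: exactly on the hour, 0.4 s past, and 16.5 s past
start <- as.POSIXct("2013-09-26 17:00:00", tz = "GMT")
x <- start + c(0, 0.4, 16.5)

# floor() discards the fractional part, so 0.4 s is still treated as second 0
on_hour <- minute(x) == 0 & floor(second(x)) == 0
on_hour
```

Using `round()` instead of `floor()` would be stricter, keeping only timestamps less than half a second past the hour; which one fits depends on how the measurements were logged.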

Upvotes: 2
