Reputation: 21
I have a large .txt data file and I need to subset based on a date range.
head(newFile)
Date Time Global_active_power Global_reactive_power Voltage Global_intensity
1 16/12/2006 17:24:00 4.216 0.418 234.84 18.4
2 16/12/2006 17:25:00 5.360 0.436 233.63 23.0
3 16/12/2006 17:26:00 5.374 0.498 233.29 23.0
4 16/12/2006 17:27:00 5.388 0.502 233.74 23.0
5 16/12/2006 17:28:00 3.666 0.528 235.68 15.8
6 16/12/2006 17:29:00 3.520 0.522 235.02 15.0
Sub_metering_1 Sub_metering_2 Sub_metering_3
1 0 1 17
2 0 1 16
3 0 2 17
4 0 1 17
5 0 1 17
6 0 2 17
I only need to use the data from the dates 2007-02-01 and 2007-02-02.
I think I would need to convert the Date and Time variables to Date/Time classes in R using the strptime() and as.Date() functions, but I'm not clear on how to do that.
What is the simplest/cleanest way to do this?
Upvotes: 1
Views: 903
Reputation: 1585
You can use the lubridate library. The code below is just an example; I changed your data slightly so that some rows fall inside the date range.
library(lubridate)
> df <- read.table("test2.txt", header=TRUE)
> df
Date Time Global_active_power Global_reactive_power Voltage
1 16/12/2006 17:24:00 4.216 0.418 234.84
2 16/12/2006 17:25:00 5.360 0.436 233.63
3 16/12/2007 17:26:00 5.374 0.498 233.29
4 16/12/2007 17:27:00 5.388 0.502 233.74
5 16/12/2006 17:28:00 3.666 0.528 235.68
Global_intensity
1 18.4
2 23.0
3 23.0
4 23.0
5 15.8
> date1 = dmy("04/06/2007")
> date2 = dmy("04/06/2009")
> with( df , df[ dmy(Date) >= date1 & dmy(Date) <= date2 , ] )
Date Time Global_active_power Global_reactive_power Voltage
3 16/12/2007 17:26:00 5.374 0.498 233.29
4 16/12/2007 17:27:00 5.388 0.502 233.74
Global_intensity
3 23
4 23
>
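If you would rather stay in base R, as the question suggests, something along these lines should work. This is only a sketch: the small data frame stands in for your real file (its values are made up for illustration), and the names dates, keep, feb and DateTime are ones I picked, not anything from your code. I am also assuming the Date column is day/month/year text, as in your head() output.

```r
# Stand-in for the real data; the values are invented for illustration only.
newFile <- data.frame(
  Date = c("31/1/2007", "1/2/2007", "2/2/2007", "3/2/2007"),
  Time = c("23:59:00", "00:00:00", "12:30:00", "00:00:00"),
  Global_active_power = c(1.1, 2.2, 3.3, 4.4),
  stringsAsFactors = FALSE
)

# Parse the day/month/year strings into Date objects.
dates <- as.Date(newFile$Date, format = "%d/%m/%Y")

# Keep rows whose date lies in the closed range 2007-02-01 .. 2007-02-02.
keep <- dates >= as.Date("2007-02-01") & dates <= as.Date("2007-02-02")
feb <- newFile[keep, ]

# Optionally combine Date and Time into one POSIXct column via strptime().
# The time zone is an assumption; use whatever applies to your data.
feb$DateTime <- as.POSIXct(
  strptime(paste(feb$Date, feb$Time), format = "%d/%m/%Y %H:%M:%S", tz = "UTC")
)
```

The comparison is done on real Date objects rather than on the text, so the order of day and month in the file does not matter once the format string is right.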
Upvotes: 3