add-semi-colons
add-semi-colons

Reputation: 18810

R Read CSV file that has timestamp

I have a csvfile that has a time stamp column as a string

15,1035,4530,3502,2,892,482,0,20060108081608,2,N
15,1034,7828,3501,3,263,256,0,20071124175519,3,N
15,1035,7832,4530,2,1974,1082,0,20071124193818,7,N
15,2346,8381,8155,3,2684,649,0,20080207131002,9,N

I use the read.csv option but the problem with that is once I finish the import the data column looks like:

1       15     1035   4530          3502       2        892   482       0 2.006011e+13          2          N
2       15     1034   7828          3501       3        263   256       0 2.007112e+13          3          N
3       15     1035   7832          4530       2       1974  1082       0 2.007112e+13          7          N
4       15     2346   8381          8155       3       2684   649       0 2.008021e+13          9          N

Is there away to strip the date from string as it get read (csv file does have headers: removed here to keep data anonymous). If we can't strip as it get read can what is the best way to do the strip?

Upvotes: 3

Views: 4044

Answers (3)

Blue Phoenix
Blue Phoenix

Reputation: 11

You can use lubridate package

test <- '20080207131002'
lubridate::as_datetime(test)

Can also specify format for each case depends on your needs

Upvotes: 0

Adam Hyland
Adam Hyland

Reputation: 1057

As Ricardo says, you can set the column classes with read.csv. In this case I recommend importing these as characters and once the csv is loaded, converting them to dates with strptime().

for example:

test <- '20080207131002'
strptime(x = test, format = "%Y%m%d%H%M%S")

Which will return a POSIXlt object w/ the date/time info.

Upvotes: 0

agstudy
agstudy

Reputation: 121568

Here 2 methods:

Using zoo package. Personally I prefer this one. I deal with your data as a time series.

library(zoo)

read.zoo(text='15,1035,4530,3502,2,892,482,0,20060108081608,2,N
15,1034,7828,3501,3,263,256,0,20071124175519,3,N
15,1035,7832,4530,2,1974,1082,0,20071124193818,7,N
15,2346,8381,8155,3,2684,649,0,20080207131002,9,N',
         index=9,tz='',format='%Y%m%d%H%M%S',sep=',')

                   V1 V2   V3   V4   V5 V6   V7   V8 V10 V11
2006-01-08 08:16:08 15 1035 4530 3502 2   892  482 0  2   N  
2007-11-24 17:55:19 15 1034 7828 3501 3   263  256 0  3   N  
2007-11-24 19:38:18 15 1035 7832 4530 2  1974 1082 0  7   N  
2008-02-07 13:10:02 15 2346 8381 8155 3  2684  649 0  9   N  

Using colClasses argument in read.table, as mentioned in the comment :

dat <- read.table(text='15,1035,4530,3502,2,892,482,0,20060108081608,2,N
15,1034,7828,3501,3,263,256,0,20071124175519,3,N
15,1035,7832,4530,2,1974,1082,0,20071124193818,7,N
15,2346,8381,8155,3,2684,649,0,20080207131002,9,N',
           colClasses=c(rep('numeric',8),
                        'character','numeric','character')
                        ,sep=',')


strptime(dat$V9,'%Y%m%d%H%M%S')

1] "2006-01-08 08:16:08" "2007-11-24 17:55:19" 
  "2007-11-24 19:38:18" "2008-02-07 13:10:02"

Upvotes: 2

Related Questions