blazej
blazej

Reputation: 1788

R: import date and time from xlsx

Could anyone help me out with proper import settings for an excel file with a following kind of structure (for time series analysis):

label1  varName
label2  random text
label3  random text2
2015-01-01 01:00    85
2015-01-01 02:00    26
2015-01-01 03:00    15
2015-01-01 04:00    13
2015-01-01 05:00    22
2015-01-01 06:00    21
2015-01-01 07:00    13
2015-01-01 08:00    22
2015-01-01 09:00    20
2015-01-01 10:00    31
2015-01-01 11:00    36
2015-01-01 12:00    33
2015-01-01 13:00    33
2015-01-01 14:00    33

label and varName are rows to keep. Rows 2 and 3 should be deleted.

Database starts at Jan 1st 2015 1am and ends at 31dec 2015 11pm. For most days I have a value for each hour. There are some NAs inside values but nrow=8760

I'm still learning how to do time series in R, but I'd imagine that it would be easier to handle summaries (say day by day means) if R would split %Y-%m-%d %H:%M column into two separate ones.

Simple import with RStudio default readxl library fails as rows 2 and 3 get imported and date is translated into a funny format: 42005.041666666664

enter image description here

Normally I would deal with this by hand (in excel) and import a clean txt to R. Problem is I need to process 61 similar files (for different years and different variables). I'm sure there is a way to automate this task, but after 6hours of searching, testing and reading I'm basically in the same spot as this morning.

I'd appreciate any kind of hint or help. Thank you

Upvotes: 3

Views: 4193

Answers (2)

Enrico Schumann
Enrico Schumann

Reputation: 1493

You can convert the numbers back to a datetime format, such as POSIXct.

library("datetimeutils")
convert_date(42005.041666666664, type = "excel", fraction = TRUE)
## [1] "2015-01-01 01:00:00 CET"

Upvotes: 2

Pavel
Pavel

Reputation: 1127

Exel often uses different cell format to store date time values.

Example 1:

Cell value visible to user: 1/28/2019 7:00:00 AM 
Exel stores value in general format: 43493.2916666667

43493 - number of days after 1/1/1900
0.2916666667 - % of the day [7*60/(24*60)]

Example 2:

Cell value visible to user: 1/28/2019 7:23:33 AM
Exel stores value in general format: 43493.3080208333

0.3080208333 - % of the day [(7*60*60+23*60+33)/(24*60*60)]

Upvotes: 1

Related Questions