Reputation: 11
I'm rather new at R so I appreciate your help in advance.
I have an excel spreadsheet dataset including daily, closing prices for a financial asset ranging from Jan 3rd 2000 all the way to Mar 6th 2019. I am looking to use ARIMA models to forecast future prices for this asset but cannot input the data in R.
Simply put, I have an Excel file with 2 columns and 4736 rows as per below:
03/jan/00 1,8358
04/jan/00 1,8702
05/jan/00 1,8520
.
.
.
01/mar/19 3,7824
06/mar/19 3,8372
Right after reading the file using the command read_excel(file.choose())
I've tried using ts commands to read the data as a time series. However considering data is only collected during market closing times on business days, this does not produce the desired results. Also tried researching and read about xts
and zoo
-type series but do not seem to be able to make these work. Please note that this data excludes weekends and holidays, which seems to be the root of the problem here.
my_data <- read_excel(file.choose())
series <- ts(my_data, start=c(2000,1,3), end=c(2019,3,6), frequency = 252)
By using ts considering an average 252 business days per year, my time series ends on a different date as it should.
What kind of steps should I consider in order to 1) read the data from the excel file and 2) read the imported data in R as a time series?
Also thought about, instead of using dates as the first column, use numbers from 1 through 4736 to eliminate this necessity but it seems the first column is required to have the YYYY-m-d format.
Thanks a ton
Upvotes: 1
Views: 1175
Reputation: 1906
1) read_excel
should read it in as a tibble
. In case dates are read as values use janitor::excel_numeric_to_date
to convert to correct date
2) To format date use lubridate
. In your case dmy
function would be it.
3) then convert the tibble
to xts
using tbl2xts
; vignette:
https://cran.r-project.org/web/packages/tbl2xts/vignettes/tbl2xts_vignette.html
Upvotes: 1