I have a data frame that I am converting into a tsibble time series object to allow for easier time series graphing and manipulation (rolling time window analysis) of the data. I obtain new data daily that I would like to append to the original data frame, represented as df; the new incoming data is represented as df2. I can convert each of these data.frames into a tsibble object independently, but when I use rbind() to join them first and then call as_tsibble(), I get an error.
as_tsibble(final_df, index = date, key = ticker)
Error: A valid tsibble must have distinct rows identified by key and index.
i Please use duplicates() to check the duplicated rows.
To set up the problem, here is the code for a reprex.
df <- data.frame(ticker = c("UST10Y", "UST2Y", "AAPL", "SPX", "BNO"),
buy_price = c(62.00, 68.00, 37.00, 55.00, 41.00),
sale_price = c(64.00, 71.00, 42.00, 60.00, 45.00),
close_price = c(63.00, 70.00, 38.00, 56.00, 43.00),
date = c(as.Date("April 29th, 2021", "April 29th, 2021", "April 29th, 2021", "April 29th, 2021", "April 29th, 2021")))
df2 <- data.frame(ticker = c("UST10Y", "UST2Y", "AAPL", "SPX", "BNO"),
buy_price = c(63.00, 69.00, 38.00, 53.00, 44.00),
sale_price = c(66.00, 77.00, 47.00, 63.00, 48.00),
close_price = c(65.00, 74.00, 39.00, 55.00, 45.00),
date = c(as.Date("April 30th, 2021", "April 30th, 2021", "April 30th, 2021", "April 30th, 2021", "April 30th, 2021")))
final_df <- rbind(df,df2)
str(final_df)
> 'data.frame': 10 obs. of 5 variables:
as_tsibble(final_df, index = date, key = ticker)
Upon running as_tsibble(final_df, index = date, key = ticker), the row order is also changed to alphabetical, whereas I would like to preserve the original order (a separate question).
I am unable to create a tsibble from final_df, although a tsibble can be created from df and df2 individually.
Am I missing something, or is it impossible to have a tsibble object with multiple rows of the same ticker name?
A tsibble must have a unique time point (the index) for each observation in a time series, where each time series is identified by the key.
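The uniqueness rule can be checked in base R before calling as_tsibble(). The following is a minimal sketch on hypothetical toy data (the ok and bad data frames and the has_dups() helper are illustrative names, not part of the question): it tests whether any key/index pair occurs more than once.

```r
# Hypothetical toy data: two tickers, each observed on two distinct days.
ok <- data.frame(
  ticker = c("AAPL", "AAPL", "SPX", "SPX"),
  date   = as.Date(c("2021-04-29", "2021-04-30", "2021-04-29", "2021-04-30"))
)

# Same tickers, but every row carries the same date (the single value is recycled).
bad <- data.frame(
  ticker = c("AAPL", "AAPL", "SPX", "SPX"),
  date   = as.Date("2021-04-29")
)

# TRUE if any (ticker, date) pair appears more than once.
has_dups <- function(d) anyDuplicated(d[c("ticker", "date")]) > 0

has_dups(ok)   # FALSE: every ticker has one row per date
has_dups(bad)  # TRUE: each ticker has two rows on the same date
```

A data frame for which has_dups() returns TRUE is the kind of input that triggers the "distinct rows identified by key and index" error.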
The datasets that you have constructed for your MRE appear to have this quality; however, the conversion to date is not giving you the desired results. For example, your index variable in df is:
as.Date("April 29th, 2021", "April 29th, 2021", "April 29th, 2021", "April 29th, 2021", "April 29th, 2021")
#> [1] "2021-05-06"
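as.Date() treats only its first argument as the input. The second string is taken as the format, and because that format contains no date conversion specifications, the result is a single date: the day the code was run. data.frame() recycles that one value across all five rows, so df and df2 end up with the same date, and rbind() produces two rows per ticker with an identical index.
To correctly parse "April 29th, 2021" you could use the {lubridate} package's mdy() function: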
lubridate::mdy("April 29th, 2021", "April 29th, 2021", "April 29th, 2021", "April 29th, 2021", "April 29th, 2021")
#> [1] "2021-04-29" "2021-04-29" "2021-04-29" "2021-04-29" "2021-04-29"
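If adding {lubridate} is not an option, base R can parse the same strings when they are passed as one character vector with an explicit format. This is only a sketch under two assumptions: the session uses an English locale (so that %B matches "April"), and every day number is followed by "th" (strings such as "April 1st, 2021" would need different handling). The variable names raw, parsed and misparsed are illustrative.

```r
# Assumption: English locale, and every date string ends its day with "th".
raw <- rep("April 29th, 2021", 5)

# Pass ONE character vector plus an explicit format string.
parsed <- as.Date(raw, format = "%B %dth, %Y")
parsed  # five Date values, one per input string

# For contrast: passing the strings as separate arguments makes the second
# one the format, so only a single Date comes back.
misparsed <- as.Date("April 29th, 2021", "April 29th, 2021")
length(misparsed)  # 1
```

mdy() remains the more forgiving choice, since it also copes with ordinal suffixes such as "st", "nd" and "rd".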
Once the dates are parsed correctly, the issue is resolved and the tsibble can be created.
library(tsibble)
library(lubridate)
df <- data.frame(ticker = c("UST10Y", "UST2Y", "AAPL", "SPX", "BNO"),
buy_price = c(62.00, 68.00, 37.00, 55.00, 41.00),
sale_price = c(64.00, 71.00, 42.00, 60.00, 45.00),
close_price = c(63.00, 70.00, 38.00, 56.00, 43.00),
date = mdy(c("April 29th, 2021", "April 29th, 2021", "April 29th, 2021", "April 29th, 2021", "April 29th, 2021")))
df2 <- data.frame(ticker = c("UST10Y", "UST2Y", "AAPL", "SPX", "BNO"),
buy_price = c(63.00, 69.00, 38.00, 53.00, 44.00),
sale_price = c(66.00, 77.00, 47.00, 63.00, 48.00),
close_price = c(65.00, 74.00, 39.00, 55.00, 45.00),
date = mdy(c("April 30th, 2021", "April 30th, 2021", "April 30th, 2021", "April 30th, 2021", "April 30th, 2021")))
final_df <- rbind(df,df2)
as_tsibble(final_df, index = date, key = ticker)
#> # A tsibble: 10 x 5 [1D]
#> # Key:       ticker [5]
#>    ticker buy_price sale_price close_price date
#>    <chr>      <dbl>      <dbl>       <dbl> <date>
#>  1 AAPL          37         42          38 2021-04-29
#>  2 AAPL          38         47          39 2021-04-30
#>  3 BNO           41         45          43 2021-04-29
#>  4 BNO           44         48          45 2021-04-30
#>  5 SPX           55         60          56 2021-04-29
#>  6 SPX           53         63          55 2021-04-30
#>  7 UST10Y        62         64          63 2021-04-29
#>  8 UST10Y        63         66          65 2021-04-30
#>  9 UST2Y         68         71          70 2021-04-29
#> 10 UST2Y         69         77          74 2021-04-30
Created on 2021-05-06 by the reprex package (v1.0.0)