stakowerflol
stakowerflol

Reputation: 1079

How parse json object to POSIXct format with jsonlite?

I want to tell json parser that one of the columns should be POSIXct. How to do this, when the parser loses time information.

    library(rvest)
    library(jsonlite)

      url = "http://www.cmegroup.com/CmeWS/mvc/Settlements/Futures/Settlements/5081/FUT?tradeDate=04/06/2018&strategy=DEFAULT&pageSize=500"
      strJson <- html_text(read_html(url))

      json <- map(strJson, ~fromJSON(.))
      mData <- as.matrix(as.data.frame(json))
      dat <- as.POSIXct(mData[,10], format='%H:%M:%S %d %b %Y')

All dat is NA

Upvotes: 0

Views: 109

Answers (3)

Martin Schmelzer
Martin Schmelzer

Reputation: 23889

The problem might be the locale LC_TIME. Mine was set to de_DE.UTF-8 and therefore the strings could not be parsed because of locale specific options like %p. Check your locale with Sys.getlocale("LC_TIME").

Changing it to an English one does do the job:

library(rvest)
library(jsonlite)

url = "http://www.cmegroup.com/CmeWS/mvc/Settlements/Futures/Settlements/5081/FUT?tradeDate=04/06/2018&strategy=DEFAULT&pageSize=500"
strJson <- html_text(read_html(url))

json  <- map(strJson, ~fromJSON(.))
mData <- as.data.frame(json, stringsAsFactors = F)

Sys.setlocale("LC_TIME", "en_GB.UTF-8")
dat  <- as.POSIXct(mData$updateTime, format='%A, %d %b %Y %I:%M %p')



head(dat)
[1] "2018-04-06 22:33:00 CEST" "2018-04-06 22:33:00 CEST" "2018-04-06 22:33:00 CEST" "2018-04-06 22:33:00 CEST"
[5] "2018-04-06 22:33:00 CEST" "2018-04-06 22:33:00 CEST"

Upvotes: 1

hrbrmstr
hrbrmstr

Reputation: 78792

NOT AN ANSWER since Martin has a great+succinct one for the specific quetion (but pedants are invited to downvote if it makes you feel better abt yourselves)

just a suggestion for an alternate data acquisition and conversion workflow:

src_url <- "http://www.cmegroup.com/CmeWS/mvc/Settlements/Futures/Settlements/5081/FUT?tradeDate=04/06/2018&strategy=DEFAULT&pageSize=500"

x <- jsonlite::fromJSON(src_url)

x$settlements <- list(x$settlements)

tidyr::unnest(tibble::as_data_frame(x))
## # A tibble: 58 x 14
##    updateTime  dsHeader reportType tradeDate empty month open  high  low   last  change settle volume openInterest
##    <chr>       <chr>    <chr>      <chr>     <lgl> <chr> <chr> <chr> <chr> <chr> <chr>  <chr>  <chr>  <chr>       
##  1 Friday, 06… Dated B… Final      04/06/20… FALSE APR … -     -     -     -     -1.02  66.81  0      75          
##  2 Friday, 06… Dated B… Final      04/06/20… FALSE MAY … -     -     -     -     -1.24  66.84  0      75          
##  3 Friday, 06… Dated B… Final      04/06/20… FALSE JUN … -     -     -     -     -1.24  66.37  0      75          
##  4 Friday, 06… Dated B… Final      04/06/20… FALSE JLY … -     -     -     -     -1.25  65.95  0      75          
##  5 Friday, 06… Dated B… Final      04/06/20… FALSE AUG … -     -     -     -     -1.24  65.56  0      0           
##  6 Friday, 06… Dated B… Final      04/06/20… FALSE SEP … -     -     -     -     -1.24  65.19  0      0           
##  7 Friday, 06… Dated B… Final      04/06/20… FALSE OCT … -     -     -     -     -1.25  64.78  0      0           
##  8 Friday, 06… Dated B… Final      04/06/20… FALSE NOV … -     -     -     -     -1.24  64.46  0      0           
##  9 Friday, 06… Dated B… Final      04/06/20… FALSE DEC … -     -     -     -     -1.22  64.15  0      0           
## 10 Friday, 06… Dated B… Final      04/06/20… FALSE JAN … -     -     -     -     -1.20  63.82  0      0           
## # ... with 48 more rows

Upvotes: 1

Shique
Shique

Reputation: 744

Make sure you put library(purrr) as well as map() is a function from that package. The column is imported as factor. It needs to change to character first.

cdat <- as.character(mData[,10]))

After that, the format you posted is not equal to the data format. You need to parse it exactly as the data says.

dat <- as.POSIXct(cdat, format = '%A, %d %b %Y %H:%M %p', tz = 'UTC')
# [1] "2018-04-06 10:33:00 UTC" "2018-04-06 10:33:00 UTC" "2018-04-06 10:33:00 UTC" "2018-04-06 10:33:00 UTC" "2018-04-06 10:33:00 UTC"
# [6] "2018-04-06 10:33:00 UTC" "2018-04-06 10:33:00 UTC" "2018-04-06 10:33:00 UTC" "2018-04-06 10:33:00 UTC" "2018-04-06 10:33:00 UTC"
#...

Or in one line:

dat <- as.POSIXct(as.character(mData[,10]), format = '%A, %d %b %Y %H:%M %p', tz = 'UTC')

Upvotes: 0

Related Questions