Reputation: 2666
I am trying to read the following JSON database into R using the jsonlite
package.
library(jsonlite)
db <- fromJSON("http://www.stbates.org/funguild_db.php", flatten=TRUE)
Doing this throws the following error:
Error in parse_con(txt, bigint_as_char) :
lexical error: invalid char in json text.
<html> <head> <title>funguild_d
(right here) ------^
Clearly the parser does not like the leading HTML tags. Is there a simple workaround that I am missing?
Upvotes: 1
Views: 670
Reputation: 26258
@MrFlick is right that this is not a good way to serve data. But as always, there are ways around it. Here I'm using rvest to scrape the entire page, then gsub to strip the leading string "funguild_db", which is the page title and happens to match the final part of the URL (minus the .php extension).
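url <- "http://www.stbates.org/funguild_db.php"
library(rvest)
library(jsonlite)
# Download the page and keep only its text content (the HTML tags are dropped)
js <- url %>%
  read_html() %>%
  html_text()
# Remove the page title so that only the JSON remains, then parse it
js <- jsonlite::fromJSON(gsub("funguild_db", "", js))
# Preview the first five columns
head(js[, 1:5])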
# $oid taxon taxonomicLevel trophicMode guild
# 1 58f450f1791497fd28ebfccc Xanthomonas campestris 20 Pathotroph Plant Pathogen
# 2 58f450f1791497fd28ebfccd Xanthomonas juglandis 20 Pathotroph Plant Pathogen
# 3 58f450f1791497fd28ebfcce Xanthoparmelia 13 Symbiotroph Lichenized
# 4 58f450f1791497fd28ebfccf Xanthopeltis 13 Symbiotroph Lichenized
# 5 58f450f1791497fd28ebfcd0 Xanthopsora 13 Symbiotroph Lichenized
# 6 58f450f1791497fd28ebfcd1 Xanthopsorella 13 Symbiotroph Lichenized
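One caveat with the gsub approach is that it removes every occurrence of "funguild_db", including any that might appear inside the data. As a rough alternative sketch, you could instead keep only the span between the first opening bracket or brace and the last closing one. The helper name extract_json_text and the sample string below are made up for illustration; they are not part of rvest or jsonlite, and the sample only imitates what html_text() might return for this page.

```r
library(jsonlite)

# Hypothetical helper (not from any package): keep only the text from the
# first '[' or '{' up to the last ']' or '}'.
extract_json_text <- function(page_text) {
  start <- regexpr("\\[|\\{", page_text)
  ends <- gregexpr("\\]|\\}", page_text)[[1]]
  if (start == -1 || ends[1] == -1) {
    stop("no JSON-like content found in the text")
  }
  substr(page_text, start, ends[length(ends)])
}

# Made-up sample imitating the scraped text: page title first, then the JSON.
sample_text <- 'funguild_db [{"taxon":"Xanthoparmelia","guild":"Lichenized"}] '
json_text <- extract_json_text(sample_text)
parsed <- fromJSON(json_text)
```

With the real page, the result of html_text() would be passed in place of sample_text. This still assumes the page contains exactly one JSON document and no stray brackets or braces before or after it.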
Upvotes: 2