Reputation: 1217
I am having trouble scraping the text from this webpage: http://www.iplant.cn/info/Acer%20stachyophyllum?t=foc
What I need is the text in the center of the page: "Trees to 15 m tall, dioecious. ..." I tried the read_html function from the R package rvest, but got nothing. Could anyone help me with that? Thanks so much.
Upvotes: 0
Views: 53
Reputation: 174586
This part of the page is generated by an XHR call after the page loads, so it is not in the HTML that read_html retrieves. You can get the description text for any species by calling the underlying endpoint directly:
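get_description <- function(species_name)
{
  # Endpoint the page queries in the background for the description
  url <- "http://www.iplant.cn/ashx/getfoc.ashx"
  # Spaces in the species name become "+"; m is sent as a random number
  query <- paste0("?key=", gsub(" ", "+", species_name),
                  "&key_no=&m=", runif(1), 9)
  # The response is JSON; its Description field holds the text as HTML
  jsonlite::fromJSON(paste0(url, query))$Description
}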
So for example:
get_description("Actaea asiatica")
#> [1] "<p>Rhizome black-brown, with numerous slender fibrous roots.
#> Stems 30--80 cm tall, terete, 4--6(--9) mm in diam., unbranched,
#> basally glabrous, apically white pubescent. Leaves 2 or 3, proximal
#> cauline leaves 3 × ternately pinnate ...<truncated>
get_description("Acer stachyophyllum")
#> [1] "<p>Trees to 15 m tall, dioecious. Bark yellowish brown, smooth.
#> Branchlets glabrous. Leaves deciduous; petiole 2.5-8 cm, slightly
#> pubescent near apex; leaf blade ovate or oblong, 5-11 × 2.5-6 cm,
#> undivided or 3-lobed, papery, abaxially densely pale or white pubescent,
#> becoming less so when mature or nearly glabrous, adaxially glabrous,
#> 3-5-veined at base abaxially, rarely with rudimentary...<truncated>
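The returned description is an HTML fragment (note the leading `<p>` in the output above). If plain text is wanted, the tags can be removed afterwards. The following is only a minimal sketch: `strip_html` is a hypothetical helper name that is not part of the answer above, and it assumes the markup is simple enough for a regular expression. It does not decode HTML entities, so an HTML parser such as xml2 or rvest would be more robust for anything more complex.

```r
# Hypothetical helper (not from the original answer): strip HTML tags
# from a description string using base R only.
# Assumes simple markup such as <p>...</p>; entities are left untouched.
strip_html <- function(html) {
  text <- gsub("<[^>]+>", " ", html)       # replace each tag with a space
  text <- gsub("[[:space:]]+", " ", text)  # collapse runs of whitespace
  trimws(text)                             # drop leading/trailing spaces
}

strip_html("<p>Trees to 15 m tall, dioecious.</p><p>Bark yellowish brown, smooth.</p>")
#> [1] "Trees to 15 m tall, dioecious. Bark yellowish brown, smooth."
```

It could then be combined with the function above as `strip_html(get_description("Acer stachyophyllum"))`.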
Upvotes: 1