Cecile Tobin

Reputation: 1

How to scrape data from a web graph into R?

The website TRAC Immigration has data on the number of ICE deportations by month and year for each city in Texas. I would like to download this data into R, but there is no data file available. I think this means I need to scrape the data, but I don't know how to do so. Here is the website: TRAC Immigration

There is a table for each city that displays the total number of deportations over the 19-year period, but not by month and year.

Table with number of deportations by county

However, there is a graph for each city that displays the number of deportations by month and year. This information is only displayed when you hover your cursor over each bar of the graph.

Graph with number of deportations by county by month and year

Please let me know if you have any ideas about how I could scrape the data from the graph for each city into R. I would eventually like to have the number of deportations be a variable in a dataset.

Upvotes: 0

Views: 123

Answers (1)

DaveArmstrong

Reputation: 22034

@Dave2e did the hard work, but here's a way of using what he found to get the different cities: replace depart_state with depart_city in the URL. Since you don't know which numeric ID corresponds to which city, you can brute-force through all of them. I was able to get the data for 397 cities in a few minutes:

library(dplyr)  # for %>% and filter()

out <- NULL
for (i in 1:397) {
  # Build the JSON endpoint URL for city ID i
  url <- glue::glue("https://trac.syr.edu/phptools/immigration/remove/graph.php?stat=count&timescale=fymon&depart_city={i}&timeunit=number")
  j <- jsonlite::fromJSON(url)
  tm <- j$timeline   # month-by-month deportation counts
  tm$city <- j$title # attach the city name returned with the JSON
  out <- rbind(out, tm)
}
out %>% filter(city == "LAREDO, TX, POE")
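For a long scrape like this, some city IDs may fail or return nothing. Here is a hedged variation of the same loop: the endpoint URL and the 1:397 ID range come from the answer above, while the tryCatch() guard and the Sys.sleep() pause are assumptions added to make the run more robust and polite to the server, not part of the original answer.

```r
out <- NULL
for (i in 1:397) {
  url <- glue::glue("https://trac.syr.edu/phptools/immigration/remove/graph.php?stat=count&timescale=fymon&depart_city={i}&timeunit=number")

  # If a request fails (bad ID, network hiccup), record NULL and move on
  res <- tryCatch(jsonlite::fromJSON(url), error = function(e) NULL)

  if (!is.null(res) && !is.null(res$timeline)) {
    tm <- res$timeline
    tm$city <- res$title
    out <- rbind(out, tm)
  }

  Sys.sleep(0.5)  # brief pause between requests
}
```

Because `out` accumulates one row per month per city, the monthly deportation count ends up as a regular column in the combined data frame, which is the "variable in a dataset" the question asks for.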

Upvotes: 2
