CoolGuyHasChillDay

Reputation: 747

RSelenium: Can't see Browser as I run my Code

macOS Sierra 10.12.4. Chrome 63 (most recent). RStudio 1.1.383.

I'm using RSelenium to scrape web data. I can pull data through the remote driver, but the browser window never appears, which makes it difficult to debug some of my trickier web pulls. This is an example video of what I want to happen: the user can see the changes they are making in the browser. The goal of this post is to find out why I cannot see the browser as I run the code.

Here's an example of my process to pull from RSelenium.

From the Terminal:

(name)$ docker run -d -p 4567:4444 selenium/standalone-chrome
(name)$ docker ps

output:

CONTAINER ID        IMAGE                        COMMAND                  CREATED             STATUS              PORTS                    NAMES
8de3a1cbc777        selenium/standalone-chrome   "/opt/bin/entry_po..."   5 minutes ago       Up 5 minutes        0.0.0.0:4567->4444/tcp   wizardly_einstein

In R

library(RSelenium)
library(magrittr)
library(stringr)
library(stringi)
library(XML)

remDr <- rsDriver(port = 4567L, browser = "chrome")
remDr$client$open()

remDr$client$navigate("https://shiny.rstudio.com/gallery/datatables-options.html")
webElems <- remDr$client$findElements("css selector", "iframe")
remDr$client$switchToFrame(webElems[[1]])
elems <- remDr$client$findElements("css selector", "#showcase-app-container > nav > div > ul li")

unlist(lapply(elems, function(x) x$getElementText()))
[1] "Display length"    "Length menu"       "No pagination"     "No filtering"      "Function callback"

This is my confirmation that RSelenium is working properly. However, this is all happening "blindly" - I can't see what is going on. In a complicated web pull I'm trying to perform (hidden behind credentials so I can't give an example), certain elements cannot be found after iterations even though I know they are on the page. Being able to see the browser would allow me to easily debug the code.

Not sure if this means anything, but it doesn't look like the driver is attached to an IP address:

(name)$ docker-machine ip
Error: No machine name(s) specified and no "default" machine exists

Is there something else I need to download to be able to visually see the webdriving process? Thanks in advance.

Upvotes: 5

Views: 1474

Answers (1)

IanK

Reputation: 386

I'm not sure about the exact behavior in that video, but I always use a phantomjs headless browser and look at screenshots as I go. This code would produce what I'm talking about:

library(RSelenium)

#this sets up the phantomjs driver
pjs <- wdman::phantomjs()

#open a connection to it
dr <- rsDriver(browser = 'phantomjs')
remdr <- dr[['client']]

#go to the site
remdr$navigate("https://stackoverflow.com/")

#show browser screenshot in viewer
remdr$screenshot(TRUE)
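
To keep a record of each step instead of only viewing the latest capture, the same screenshot call can write to disk. The following is a minimal sketch, not part of the original answer: `screenshot_path` and `save_step` are hypothetical helper names, the `screenshots` directory is an assumption, and `client` is assumed to be an RSelenium `remoteDriver` object such as `remdr` above.

```r
# Hypothetical helper: build a zero-padded file name for a given step number.
screenshot_path <- function(step, dir = "screenshots") {
  file.path(dir, sprintf("step_%03d.png", step))
}

# Hypothetical helper: save the current page as a PNG and return its path.
# `client` is assumed to be a remoteDriver object (e.g. dr[["client"]]).
save_step <- function(client, step, dir = "screenshots") {
  # Create the output directory if it does not exist yet.
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  path <- screenshot_path(step, dir)
  # remoteDriver$screenshot() writes the PNG to disk when `file` is given.
  client$screenshot(file = path)
  invisible(path)
}

# Usage, assuming `remdr` is the client created above:
# remdr$navigate("https://stackoverflow.com/")
# save_step(remdr, 1)   # writes screenshots/step_001.png
```

Numbering the files makes it possible to compare the page state across iterations when an element that should be present is not found.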

Upvotes: 2
