Reputation: 1269
My final goal is to use RSelenium through RStudio on my EC2 instance (AWS).
For that, I read that it is recommended to install Docker on a virtual machine, so I followed all the steps given by John D Harrison here: https://rpubs.com/johndharrison/RSelenium-Docker
Everything went fine until the end, when I opened RStudio on my EC2 instance.
When I try to connect to the remote server, I get the error below:
library(RSelenium)
remDr <- remoteDriver(remoteServerAddr = "192.168.99.100", port = 4445L)
remDr$open()
[1] "Connecting to remote server"
Error in checkError(res) :
  Undefined error in httr call. httr output: Timeout was reached: Connection timed out after 10001 milliseconds
I followed the exact steps given in the tutorial, so I really don't know what is wrong.
Any help is much appreciated!
M.
EDIT1:
Please find below screenshots of what I have so far:
The EC2 instance I use is the following:
Upvotes: 2
Views: 572
Reputation: 1269
Thanks @awchisholm! As you explained, I needed to install Docker on my EC2 instance and not on my local machine. Problem solved!
Upvotes: 0
Reputation: 6567
The following works for me.
Create an EC2 Ubuntu instance with Docker installed. I didn't use Windows.
Run the Selenium Docker image on the EC2 instance as follows:
docker run -d -p 4445:4444 selenium/standalone-firefox:2.53.0
Ensure port 4445 is open from the IP address where you are running R by creating the appropriate entry in the security group.
From a desktop machine that can reach the EC2 instance, use this R code to connect:
library(RSelenium)
remDr <- remoteDriver(remoteServerAddr = "ec2-xxx.eu-west-1.compute.amazonaws.com", port = 4445L)
remDr$open()
#[1] "Connecting to remote server"
#$applicationCacheEnabled
#[1] TRUE
#$rotatable
#[1] FALSE
#$handlesAlerts
#[1] TRUE
#...
Note the address of the EC2 instance is "ec2...". This address is available from the AWS console and is the public DNS name of the instance. If you happen to be running R on another AWS machine, then you would probably need to use the private DNS address.
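Before opening a session from R, it can help to confirm that the Selenium server is reachable at all, since a timeout like the one in the question usually points at the address or the security group rather than at R. The following is only a rough sketch: it assumes curl is installed and that the server answers on the usual /wd/hub/status path, and the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the original answer.

# Build the status URL for a Selenium server. The port defaults to 4445,
# the host port published by the docker run command above.
selenium_status_url() {
  printf 'http://%s:%s/wd/hub/status\n' "$1" "${2:-4445}"
}

# Succeeds if the server answers within 10 seconds, fails otherwise.
selenium_reachable() {
  curl --silent --fail --max-time 10 --output /dev/null \
    "$(selenium_status_url "$@")"
}
```

For example, selenium_reachable ec2-xxx.eu-west-1.compute.amazonaws.com && echo reachable would print "reachable" only if the port is open from where you run it.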
[Edited to add instructions for running Rstudio in the cloud]
Find the IP address of the Selenium container. One way is to log into it as follows:
docker exec -it <nameofthecontainer> bash
hostname -i
exit
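As an alternative to logging into the container, the address can usually be read from the host with docker inspect. This is a minimal sketch, assuming the container is attached to a Docker network that assigns it an IP address; the helper name container_ip is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the original answer: print the IP
# address(es) of a running container without opening a shell inside it.
container_ip() {
  docker inspect \
    --format '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' \
    "$1"
}
```

Call it as container_ip <nameofthecontainer>; docker ps lists the names of running containers.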
To run RStudio on the same EC2 machine as Selenium, one option is to use Docker. A good image is rocker/rstudio. Do the following.
docker run -d -p 8787:8787 -e PASSWORD=<password> --name rstudio rocker/rstudio
Make sure port 8787 is open to you from where you want to access Rstudio. Add entries in the security group for the instance to do this.
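The security group entry can be added in the AWS console, or from the command line if the AWS CLI is installed and configured. The sketch below assumes the latter; the helper name open_port and the example values are placeholders, not details from the setup described here.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the original answer: allow inbound TCP
# traffic on one port from one CIDR block.
# Arguments: security group ID, port, CIDR (e.g. your own address as /32).
open_port() {
  aws ec2 authorize-security-group-ingress \
    --group-id "$1" \
    --protocol tcp \
    --port "$2" \
    --cidr "$3"
}
```

A call might look like open_port sg-0123456789abcdef0 8787 203.0.113.10/32, with the group ID and address replaced by your own. Restricting the CIDR to a single address keeps RStudio from being exposed to the whole internet.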
Before installing RSelenium, install the system library it depends on inside the RStudio Docker container. The -it flags are needed to get an interactive shell.
docker exec -it rstudio bash
apt-get update
apt-get install -y libxml2-dev
exit
Find the URL of the Rstudio GUI - it will be something like this
http://ec2-xxx:8787
The username is rstudio and the password is whatever you specified when you started the container.
Install the RSelenium package from Rstudio.
install.packages("RSelenium")
Finally run the R code to access the Selenium instance.
library(RSelenium)
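# When connecting to the container's own IP, use the container port (4444); 4445 is only published on the EC2 host.
remDr <- remoteDriver(remoteServerAddr = "IP address of the Selenium container", port = 4444L)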
remDr$open()
Upvotes: 2