Reputation: 13
I am plotting a surface in R markdown using
p1 <- plot_ly() %>% add_surface(z=z,x=wRange,y=yRange) %>% layout(showlegend=FALSE,scene=list(xaxis=list(title="wMult"),yaxis=list(title="yMult"),zaxis=list(title="MAE")))
p1
Later on I want to add a point to this surface by doing this:
p2 <- p1 %>% add_markers(z=MAE1,x=wMult1,y=yMult1) %>% layout(showlegend=FALSE)
p2
A little after that I try to chart a p3 by adding another mark on top of p2.
p3 <- p2 %>% add_markers(z=MAE2,x=wMult2,y=yMult2) %>% layout(showlegend=FALSE)
p3
Unfortunately only p1 renders as an interactive chart in the HTML. p2 and p3 appear as an empty white space that's roughly the size a chart should be, but with nothing inside, both in the viewer and in browser. If I use the web inspector I can see that it's trying to render a plotly object, but it appears empty.
If I run the same code directly in RStudio, I can view the plots with the extra markers added, but they don't render when I knit the markdown.
What is going on here?
The dataset is available here: https://archive.ics.uci.edu/ml/datasets/auto+mpg
Here is the full markdown code so far:
---
title: "Gradient Descent Demo"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
setwd("[your working directory]")
data = read.csv("mpg.csv")
require(plotly)
```
## Our Dataset
Let's take a look at some sample data. It shows attributes of several old cars and their fuel economy measured in miles per gallon.
```{r c1}
head(data)
```
We'll try to predict a given car's mpg using only its weight and year.
```{r c2}
data <- data[,c("mpg","weight","year")]
pairs(data)
```
## Create a Hypothesis
Our hypothesis will be that we can get an approximation of the mpg by multipling the car's weight by some number "wMult" and adding that to the car's year multiplied by some other number "yMult". Let's just pick some numbers to start.
```{r c3, include=FALSE}
mod1 <- lm(mpg~weight+year,data=data)
bias1 <- mod1$coefficients[1]
```
```{r}
wMult1 <- -.02
yMult1 <- -2
```
We can turn this into a prediction.
(Ignore the bias - I cheated and did some behind-the-scenes pre-work.)
```{r c4}
data$mpgPred1 <- wMult1*data$weight + yMult1*data$year + bias1
head(data)
```
Ok so we have predictions. They're clearly pretty bad since they're negative, and most cars don't get negative miles per gallon. But can we measure how bad they are?
## Evaluate the Hypothesis
We need some measure of how good (or bad) our prediction is. We'll use the Mean Absolute Error ("MAE"). As the name suggests, this is calculated finding the average of the absolute difference between each predicted value and actual value.
```{r c5}
MAE1 <- mean(abs(data$mpgPred1-data$mpg))
MAE1
```
Ok so on average we're only off by about 250 mpg. Surely we can do better.
## Adjust the Hypothesis
What to use for our next hypothesis? Well we assign new wMult and yMult values and see how we do.
```{R c6}
wMult2 <- wMult1 + .03
yMult2 <- wMult2 - 1.2
data$mpgPred2 <- wMult2*data$weight + yMult2*data$year + bias1
head(data)
```
Our predictions look better (At least they're positive!), but they're still pretty far off. Let's see how much better or worse they are.
## Evaluate the Hypothesis - Round 2
```{R c7}
MAE2 <- mean(abs(data$mpgPred2-data$mpg))
MAE1
MAE2
```
Now we're only off by 50 on average. Still pretty terrible, but better than before.
## Adjust the Hypothesis - There has to be a better way.
Ok so instead of just continuing to make random guesses, let's develop a way to intelligently update our hypothesis.
Thankfully, since we're only using two variables for our analysis, we can pretty easily visualize the effect of every reasonable combination of wMult and yMult.
```{R c8, include=FALSE}
plotdata <- data.frame(wCoef=double(),yCoef=double(),MAE=double())
wRange <- seq(mod1$coefficients[2]-300*summary(mod1)$coefficients["weight","Std. Error"],mod1$coefficients[2]+300*summary(mod1)$coefficients["weight","Std. Error"],length.out=201)
yRange <- seq(mod1$coefficients[3]-300*summary(mod1)$coefficients["year","Std. Error"],mod1$coefficients[3]+300*summary(mod1)$coefficients["year","Std. Error"],length.out=201)
for(i in wRange)
{for(j in yRange)
{
preds <- (i*data$weight) + (j*data$year) + bias1
resid <- preds-data$mpg
MAE = mean(abs(resid))
newRec <- data.frame(wCoef=i,yCoef=j,MAE=MAE)
plotdata <- rbind(plotdata,newRec)
}
}
z <- matrix(plotdata$MAE,nrow=201,ncol=201)
```
```{R c9}
p1 <- plot_ly() %>% add_surface(z=z,x=wRange,y=yRange) %>% layout(showlegend=FALSE,scene=list(xaxis=list(title="wMult"),yaxis=list(title="yMult"),zaxis=list(title="MAE")))
p1
```
Great - we can visibly explore this graph and see what some good weights might be. The best one is the one that minimizes the MAE. That's the center spot at the middle of the valley, where the crease seems to dip slightly.
Let's add our first hypothesis to this chart to see where it falls.
```{R c10,warning=F}
p2 <- p1 %>% add_markers(z=MAE1,x=wMult1,y=yMult1) %>% layout(showlegend=FALSE)
p2
```
And let's add our second one
```{R c11}
p3 <- p2 %>% add_markers(z=MAE2,x=wMult2,y=yMult2) %>% layout(showlegend=FALSE)
p3
```
Ok so it turns out our second guess actually overshot. This means that if we kept updating our hypothesis in the same manner, we'd actually get worse with each new step.
## Letting the machine do it
As I mentioned before, this approach works because we only have 2 variables we're working with. But if we had more, we'd be dealing with spaces greater than 3 dimensions. This gets hard to visualize.
Thankfully there's a way for the machine to navigate those higher dimensional spaces. We'll continue to use this two dimensional approach for now to help illustrate the approach.
Upvotes: 1
Views: 2471
Reputation: 2695
In HTML land, attributes like id
are important. What's happening is that the plot's div
id is inherited from the earlier plot. This isn't allowed in HTML. So, you'll need to recreate the plot each time so that they don't inherit the plot id. I can't find a plotly
function to reset the id to prevent this problem, so my answer is to follow a strict 'does not inherit previous plot' policy:
---
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(plotly)
```
# First plot
```{r pressure, echo=FALSE}
p1 <- plot_ly(source = "plot1") %>%
add_markers(x = pressure$temperature, y = pressure$pressure)
p1
```
# Second plot
```{r pressure2, echo= FALSE}
p2 <- plot_ly(source = "plot2") %>%
add_markers(x = pressure$temperature, y = pressure$pressure) %>%
add_markers(x = pressure$temperature, y = pressure$pressure+10)
p2
```
Sidenote: if you were in a Shiny environment, each plot would get wrapped in a render/output combo that requires unique naming.
Upvotes: 2