stats_noob

Reputation: 5935

R: Plot Axis Displays Values Larger than the Original Data

I am using the R programming language. I am following a tutorial on data visualization over here: https://plotly.com/r/3d-surface-plots/

I created my own data and made a 3D plot:

library(plotly)

set.seed(123)

#generate data
a = rnorm(100,10,10)
b = rnorm(100,5,5)
c = rnorm(100,5,10)
d = data.frame(a,b,c)

#3d plot
fig <- plot_ly(z = ~as.matrix(d))
fig <- fig %>% add_surface()

#view plot
fig

As seen here, there is a point on this 3D plot where "y = 97". I am not sure how this is possible, since none of the values in the original data frame "d" are anywhere close to 97. I checked this by plotting the individual distribution of each variable in "d":

#plot individual densities 

plot(density(d$a), main = "density plots", col = "red")
lines(density(d$b), col = "blue")
lines(density(d$c), col = "green")

legend( "topleft", c("a", "b", "c"), 
text.col=c("red", "blue", "green") )

As seen here, none of the variables (a,b,c) from the original data frame "d" have any values that are close to 97.

Thus, my question: can someone please explain how it is possible that the point (x = 0, y = 97, z = 25.326) appears on this 3D plot?

Thanks

Upvotes: 0

Views: 956

Answers (3)

Peter_293

Reputation: 1

As Robbie mentioned, it has to do with how your data is organised. To convert XYZ data to the same format as the volcano dataset, you can use rasterFromXYZ() from the raster package:

library(raster)
library(plotly)

# first two columns are x and y, third is z; x and y must lie on a regular grid
r <- rasterFromXYZ(d)

# plot raster
plot_ly(z = as.matrix(r), type = "surface")
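
To illustrate what "the same format as the volcano dataset" means, here is a minimal sketch that builds a small regular grid in long (x, y, z) form and reshapes it into a matrix with base R. The grid sizes and the height formula are made up for illustration; they are not from the question. The scattered random values in the question's data frame are unlikely to lie on a regular grid, so they would probably need interpolating first.

```r
# Hypothetical regular grid: 10 x-positions and 5 y-positions.
xs <- seq(0, 9, by = 1)
ys <- seq(0, 4, by = 1)

# Long (x, y, z) format; expand.grid() varies x fastest.
grid <- expand.grid(x = xs, y = ys)
grid$z <- sin(grid$x / 2) + grid$y   # made-up surface height

# Reshape to a matrix with one row per y value and one column per x value,
# which is the row/column layout the surface trace reads.
z_mat <- matrix(grid$z, nrow = length(ys), ncol = length(xs), byrow = TRUE)

# Supplying x and y makes the axes show coordinates instead of matrix indices.
# The plot is only built if plotly is installed; print `fig` to view it.
if (requireNamespace("plotly", quietly = TRUE)) {
  fig <- plotly::add_surface(plotly::plot_ly(x = xs, y = ys, z = z_mat))
}
```

With this layout, z_mat[i, j] is the height at y = ys[i] and x = xs[j].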

Upvotes: 0

Robbie

Reputation: 161

The problem is how your matrix is built. The z-values (in your case the c variable) should be supplied as a matrix whose rows and columns act as coordinates on a surface, like a grid or raster dataset. The values you currently see along the x- and y-axes are not values from your a and b variables but the column and row numbers of your matrix. You can open the volcano dataset in R and look at how its data are organized, which should make this clearer.
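
To make the row and column point concrete, here is a minimal base R sketch using the data from the question. The 0-based offset is my reading of how the viewer labels positions when no explicit x and y are supplied, inferred from the hover label in the question rather than stated there.

```r
set.seed(123)
a <- rnorm(100, 10, 10)
b <- rnorm(100, 5, 5)
c <- rnorm(100, 5, 10)
d <- data.frame(a, b, c)

m <- as.matrix(d)
dim(m)  # 100 rows, 3 columns

# With no x or y supplied, the surface is drawn over the matrix indices
# (0-based): x runs over the 3 columns, y over the 100 rows.
x_index <- seq_len(ncol(m)) - 1  # 0, 1, 2
y_index <- seq_len(nrow(m)) - 1  # 0, 1, ..., 99

# The hover label (x = 0, y = 97) therefore points at column 1, row 98,
# i.e. the value of a in row 98; it should match the z shown in the label.
m[97 + 1, 0 + 1]
```

So y = 97 is a row position, not a data value, and all 300 entries of a, b and c are being drawn as heights.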

Upvotes: 0

stats_noob

Reputation: 5935

I am not sure if this will resolve the problem, but I tried the same approach as in this previous Stack Overflow post, "3D Surface with Plot_ly in r, with x,y,z coordinates":

library(plotly)
set.seed(123)

#generate data
a = rnorm(100,10,10)
b = rnorm(100,5,5)
c = rnorm(100,5,10)
d = data.frame(a,b,c)


data = d

plot_ly() %>%
  add_trace(data = data, x = data$a, y = data$b, z = data$c, type = "mesh3d")

Now, it appears that all values seen in this visual plot are contained in the original data frame.

However, I am still not sure what the fundamental (and mathematical) difference is between these two plots, the surface plot from the question and the mesh3d plot above.

I am curious to see what others have to say.

Thanks

Upvotes: 1
