user101089
user101089

Reputation: 3992

Interpretation of species vectors in nMDS

I'm trying to compare results I get from a PCA biplot with what I can get from MASS:isoMDS() and vegan::metaMDS() particularly using vectors representing the variables to help interpret the MDS dimensions.

My question is essentially how I can understand what is returned by the $species component of metaMDS(). How is it calculated and what do the values represent? The terminology differs: sites x species seems to correspond to cases x variables.

For the heplots::Diabetes data, the PCA is given by

data(Diabetes, package="heplots")

diab.pca <- 
  Diabetes |> 
  dplyr::select(where(is.numeric)) |>
  prcomp(scale. = TRUE)

and a fancy biplot (using the ggbiplot package) is shown below. The variable vectors are essentially the correlations of the observed variables with the dimensions.

enter image description here

Similarly, with MASS::isoMDS() I can plot the solution, and then embed variable vectors, by explicitly calculating those correlations.

diab.dist <- dist(Diabetes[, 1:5])
mds <- diab.dist |>
  MASS::isoMDS(k = 2, trace = FALSE) |>
  purrr::pluck("points") 
colnames(mds) <- c("Dim1", "Dim2")

library(ggpubr)
cols <- scales::hue_pal()(3) |> rev()
mplot <-
ggscatter(mds, x = "Dim1", y = "Dim2", 
          color = "group",
          shape = "group",
          palette = cols,
          size = 2,
          ellipse = TRUE,
          ellipse.level = 0.5,
          ellipse.type = "t") +
  geom_hline(yintercept = 0, color = "gray") +
  geom_vline(xintercept = 0, color = "gray") 

vectors <- cor(Diabetes[, 1:5], mds[, 1:2])
scale_fac <- 500
mplot + 
  coord_fixed() +
  geom_segment(data=vectors,
               aes(x=0, xend=scale_fac*Dim1, y=0, yend=scale_fac*Dim2),
               arrow = arrow(length = unit(0.2, "cm"), type = "closed"),
               linewidth = 1.1) +
  geom_text(data = vectors,
            aes(x = 1.15*scale_fac*Dim1, y = 1.07*scale_fac*Dim2, 
                label=row.names(vectors)),
            nudge_x = 4,
            size = 4) +
  theme(legend.position = "inside",
        legend.position.inside = c(.8, .8))

enter image description here

With vegan::metaMDS(), I thought that the species component would give me something comparable t0 the variable vectors, and it mostly does, except for the vector for relwt.

Here's what I do to try to create a metaMDS() version.

library(vegan)
diab.mds <- metaMDS(diab.dist, distance="euclidean", k=2, trace = 0)

# get coordinates of points & vectors
points <- data.frame(diab.mds$points, group = Diabetes$group)
vectors <- data.frame(diab.mds$species)

cols <- scales::hue_pal()(3) |> rev()
ggplot(data=points, aes(x = MDS1, y = MDS2, 
                        color = group, shape = group)) +
  geom_point(size = 2) +
  stat_ellipse(level = 0.68, linewidth=1.1) +
  geom_hline(yintercept = 0) +
  geom_vline(xintercept = 0) +
  scale_color_manual(values = cols) +
  geom_segment(data = vectors, aes(x=0, xend=MDS1, y=0, yend=MDS2,),
               arrow = arrow(length = unit(0.2, "cm"), type = "closed"),
               colour="black",
               linewidth = 1.2,
               inherit.aes = FALSE) +
  geom_text(data=vectors, aes(x=MDS1, y=MDS2, label=rownames(vectors)), 
            size=5,
            vjust = "outward",
            inherit.aes = FALSE) +
  coord_cartesian(clip = "off") +
  theme_bw(base_size = 14) +
  theme(legend.position = "inside",
        legend.position.inside = c(.8, .8))

What I get from this is below. As you can see, the configuration of points & ellipses are similar across these methods. The variable vectors are also similar, except that the one for relwt points in the opposite direction from that in the PCA & isoMDS

enter image description here

Edit My problem arose from not understanding how to translate vegan terminology of sites and species into what I'm familiar with, a data set of observations x variables. The solution here is to use envfit(). This gives me what I want.

(fit <- envfit(diab.mds, Diabetes[, 1:5]))

plot(diab.mds)
plot(fit)
points(points[, 1:2], col = Diabetes$group)
car::dataEllipse(MDS2 ~ MDS1 | group, data=points, 
                 col = cols,
                 levels = 0.5, add=TRUE)

Upvotes: 0

Views: 221

Answers (0)

Related Questions