Antoni Parellada

Reputation: 4791

Labeling points in R plot not Printing if Matrix instead of Data frame

OK... Silly simulated data... Names of subjects by initials, SAT scores, and income later in life in thousands of $'s. The entries are scaled and centered, and look like this:

names <- c("TK","AJ","CC", "ZX", "FF", "OK", "CT", "AF", "MF", "ED", "JV", "LK", "AS", "JS", "SR", "CF", "MH", "BM")
SAT <- c(1345, 1566, 1600, 1002, 1008, 999, 1599, 1488, 950, 1567, 1497, 1300, 1588, 1443, 1138, 1557, 1478, 1600)
income <- c(150e3, 250e3, 300e3, 100e3, 110e3, 199e3, 240e3, 255e3, 75e3, 299e3, 300e3, 125e3, 400e3, 120e3, 86e3, 225e3, 210e3, 60e3)

dat <- cbind(SAT, income)
row.names(dat) <- names
dat <- scale(dat, scale = T, center = T)

plot(income ~ SAT, col=as.factor(rownames(dat)), pch= 19, xlim = c(-2.5,2.5), ylim=c(-2.5,2.5), data = dat)
abline(v=0,h=0, col = "dark gray")
text(x=dat$SAT, y=dat$income, rownames(dat), pos=3, cex = 0.5)

...results in the correct plot, except for the absent labels. Here's the plot and the error message:

Error in dat$SAT : $ operator is invalid for atomic vectors

Yet, as I found out right before doing something drastic :-) everything changes if and only if I do this tiny modification to the code before plotting:

dat <- as.data.frame(dat)

And, now...

dat <- cbind(SAT, income)
row.names(dat) <- names
dat <- scale(dat, scale = T, center = T)

dat <- as.data.frame(dat)
plot(income ~ SAT, col=as.factor(rownames(dat)), pch= 19, xlim = c(-2.5,2.5), ylim=c(-2.5,2.5), data = dat)
abline(v=0,h=0, col = "dark gray")
text(x=dat$SAT, y=dat$income, rownames(dat), pos=3, cex = 0.5)

So, I guess the issue is to always make sure that we are dealing with a data frame before labeling points (?). Is R so intrinsically unfriendly because it was created to get things done without commercial interest, or because it has so many layers underneath that make it clunky? (I digress - don't mean to start a conversation...)
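
A data frame is probably not strictly required here. As a minimal sketch (not part of the original post, and using only the first three subjects as an illustrative subset), a matrix column can be pulled out by name with `[ , "name"]` instead of `$`, so the labels can be placed without converting:

```r
# Illustrative subset of the question's data (first three subjects only)
names  <- c("TK", "AJ", "CC")
SAT    <- c(1345, 1566, 1600)
income <- c(150e3, 250e3, 300e3)

dat <- scale(cbind(SAT, income))   # scale() returns a matrix
rownames(dat) <- names

# Matrix columns are extracted with [, "name"], not with $
x <- dat[, "SAT"]
y <- dat[, "income"]

plot(x, y, pch = 19, xlim = c(-2.5, 2.5), ylim = c(-2.5, 2.5),
     xlab = "SAT", ylab = "income")
abline(v = 0, h = 0, col = "darkgray")
text(x = x, y = y, labels = rownames(dat), pos = 3, cex = 0.5)
```

Whether to index the matrix or convert with `as.data.frame()` is mostly a matter of taste; the conversion keeps the `dat$SAT` syntax, while the indexing avoids an extra step.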

Upvotes: 3

Views: 225

Answers (1)

Forrest R. Stevens

Reputation: 3485

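In the future you should supply your data in case that might be the problem. But I don't think that's the case. The error comes from scale(), which returns a matrix, and the $ operator only works on lists and data frames. You can accomplish your labeling with the simpler text() function after generating your plot, once the result has been converted back to a data frame: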

names <- c(
  "TK","AJ","CC", "ZX", "FF", "OK", "CT", "AF", "MF", 
  "ED", "JV", "LK", "AS", "JS", "SR", "CF", "MH", "BM"
)
SAT <- c(
  1345, 1566, 1600, 1002, 1008, 999, 1599, 1488, 950, 
  1567, 1497, 1300, 1588, 1443, 1138, 1557, 1478, 1600
)
income <- c(150e3, 250e3, 300e3, 100e3, 110e3, 199e3, 240e3, 
  255e3, 75e3, 299e3, 300e3, 125e3, 400e3, 120e3, 86e3, 
  225e3, 210e3, 60e3
)

dat <- data.frame(SAT, income)
dat <- scale(dat, scale = TRUE, center = TRUE)

##  Note the conversion here back to a data.frame object, since scale()
##    converts to a matrix object:
dat <- as.data.frame(dat)
row.names(dat) <- names

plot(income ~ SAT, col=as.factor(rownames(dat)),
     pch= 19, xlim = c(-2.5,2.5), ylim=c(-2.5,2.5),
     data = dat)

##  Plot text labels above (pos=3) point locations:
text(x=dat$SAT, y=dat$income, labels=row.names(dat), pos=3, cex=0.5)
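
To see why the conversion back to a data frame matters, a quick check along these lines may help. This is a sketch on a hypothetical three-row subset of the data, with variable names (`m`, `df`, `msg`) chosen only for illustration:

```r
# Hypothetical three-row subset, just to inspect what scale() returns
SAT    <- c(1345, 1566, 1600)
income <- c(150e3, 250e3, 300e3)

m <- scale(data.frame(SAT, income))
is.matrix(m)       # TRUE: scale() hands back a matrix, even from a data frame
is.data.frame(m)   # FALSE

# `$` fails on a matrix; capture the error message instead of stopping
msg <- tryCatch(m$SAT, error = function(e) conditionMessage(e))
print(msg)

# After conversion, `$` works again
df <- as.data.frame(m)
is.numeric(df$SAT) # TRUE
```

Checking `is.matrix()` or `is.data.frame()` right after a transformation is a cheap way to catch this kind of silent type change.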


Upvotes: 4
