Isaac

Reputation: 100

R: Highlight cells in the cor() table that have a correlation coefficient greater than a threshold

The cor() function in R, when called on a dataframe, returns a matrix containing the correlation coefficients for each pair of columns from the dataframe. But there doesn't seem to be any option to mark the coefficients that have values above some threshold (like STATA's *).

Is there any indirect way to get R to do this?

For example,

M = matrix(rnorm(20*5, mean = 10, sd = 3), 20, 5)
symnum(cor(M), cutpoints = c(0.1, 0.5),
       symbols = c('', '*', '**'),
       legend = TRUE,
       corr = TRUE)

returns a matrix devoid of the correlation coefficients; '', '*' or '**' have replaced the values. I'd like to generate a table that contains the correlation coefficients and, at the same time, displays a '*' in the cell if the coefficient is greater than 0.1 and a '**' if it is greater than 0.5.

Upvotes: 1

Views: 2149

Answers (2)

MrGumble

Reputation: 5766

symnum returns a matrix with the same dimensions as the correlation matrix (co in this example). This piece of code does three things: it calculates the correlation matrix, rounds it off to 2 digits, and then uses paste to concatenate the numbers with the symbols returned by symnum. Just one issue: paste reduces the matrix to a vector, so we have to restore the matrix form. Luckily, both matrix and paste use column order, i.e. the elements are ordered column by column.

co <- cor(M)
co <- round(co, 2)
s <- symnum(co)  # compute the symbols while co is still numeric
co[upper.tri(co, diag=TRUE)] <- ''  # this turns co into a character matrix
noquote(matrix(paste(co, s), ncol=ncol(co)))
attr(s, 'legend')
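
If you want the exact markers from the question ('*' above 0.1, '**' above 0.5) rather than symnum's default symbols, a minimal sketch along the same lines could look like this. Note that star_cor is a hypothetical helper name, not something from base R, and that I am assuming the thresholds should apply to the absolute value of the coefficient (as symnum does), so strong negative correlations get marked too.

```r
# star_cor() is a hypothetical helper, not part of base R.
# Assumption: thresholds are compared against |r|.
star_cor <- function(co, weak = 0.1, strong = 0.5) {
  # Choose a marker per cell; ifelse() keeps the matrix shape of its test.
  stars <- ifelse(abs(co) > strong, "**", ifelse(abs(co) > weak, "*", ""))
  # sprintf() returns a plain vector in column order, so rebuild the matrix.
  out <- matrix(sprintf("%.2f%s", co, stars),
                nrow = nrow(co), dimnames = dimnames(co))
  # Blank the diagonal and upper triangle, since the matrix is symmetric.
  out[upper.tri(out, diag = TRUE)] <- ""
  noquote(out)
}

set.seed(1)  # only to make the example reproducible
M <- matrix(rnorm(20 * 5, mean = 10, sd = 3), 20, 5)
print(star_cor(cor(M)))
```

The weak and strong arguments are just the two cutoffs from the question; change them if you need different thresholds.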

Upvotes: 1

G5W

Reputation: 37641

One option might be the corrplot package.

library(corrplot)  # install.packages('corrplot') if it is not installed
corrplot(cor(M), method='number')

[Image: corrplot output]

Upvotes: 3
