user2763361

Reputation: 3919

Plot gigantic correlation matrix as colours

I have a correlation matrix $P_{i,j}$ which is $1000 \times 1000$. Because of how the data are structured, the matrix has rectangular patches of very high correlation. That is, a $20 \times 20$ square drawn anywhere in this matrix will show either a patch of highly correlated variables ($\rho_{i,j} > 0.8$) or a patch of medium to uncorrelated ones ($\rho_{i,j} \in [-0.1, 0.5]$).

How do I represent this graphically? I know of one way to visualize a matrix like this but it only works for small dimensions:

install.packages("plotrix")
library(plotrix)
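rhoMat = array(rnorm(1000*1000),dim=c(1000,1000)) # random placeholder values standing in for the correlations
color2D.matplot(rhoMat[1:10,1:10],cs1=c(0,0.01),cs2=c(0,0),cs3=c(0,0)) # 10 x 10 corner: nice!
color2D.matplot(rhoMat,cs1=c(0,0.01),cs2=c(0,0),cs3=c(0,0)) # full 1000 x 1000 matrix: broken!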

What function or algorithm would plot a red area wherever the correlations in that vicinity of the matrix $P_{i,j}$ "tend to" be high, and a different colour where they "tend to" be low? (Even better if the colour switches as we move from positive to negative correlation patches.) I want to see how many patches of high correlation there are, and whether one patch is correlated with another patch elsewhere in the dataset.

I only want to do it in R.

Upvotes: 1

Views: 920

Answers (3)

Greg Snow

Reputation: 49640

Look at the corrplot package. It has various tools for visualizing correlations; one option is to use hierarchical clustering to draw rectangles around groups of high or low correlation.
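A minimal sketch of that option, assuming the corrplot package is installed. The data here are simulated (100 variables in 5 hidden blocks, with the columns shuffled) purely for illustration; the block count, sizes, and noise level are made-up values, not taken from the question. For a full 1000 x 1000 matrix the clustering and drawing may be slow, so it is probably worth trying on a subset first.

```r
# Sketch only: simulated data, 100 variables in 5 hidden blocks.
set.seed(1)
n_obs <- 200; n_blocks <- 5; block_size <- 20
x <- do.call(cbind, lapply(seq_len(n_blocks), function(b) {
  f <- rnorm(n_obs)  # latent factor shared by every variable in this block
  sapply(seq_len(block_size), function(j) f + rnorm(n_obs, sd = 0.4))
}))
x <- x[, sample(ncol(x))]  # shuffle columns so the blocks are no longer contiguous
rho <- cor(x)

# Reorder variables by hierarchical clustering and outline n_blocks clusters.
# tl.pos = "n" suppresses the variable labels, which are unreadable at this size.
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(rho, method = "color", order = "hclust",
                     addrect = n_blocks, tl.pos = "n")
}
```

With order = "hclust" the shuffled variables should be regrouped so that the highly correlated blocks sit along the diagonal, with a rectangle drawn around each one.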

Upvotes: 1

Bryan Hanson

Reputation: 6213

I think you can use image with the argument breaks to get exactly what you want:

dat <- matrix(runif(10000), ncol = 100)
image(dat, breaks = c(0.0, 0.8, 1.0), col = c("yellow", "red"))

I always fail to think of image for this kind of problem - the name is sort of non-obvious. I started with heatmap and then it led me to image.
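To get the "tends to be high in this vicinity" view with a colour switch between negative and positive values, one possibility is to average the correlations over tiles and feed both the full matrix and the tile means to image with a diverging palette. This is only a sketch under stated assumptions: the data are simulated, the 20-variable block size is taken from the question's $20 \times 20$ example, and the names tile_mean, grp and pal are made up for illustration.

```r
# Sketch only: simulated data stands in for the real 1000-variable dataset.
set.seed(42)
n_obs <- 100; p <- 1000; block_size <- 20
n_blocks <- p / block_size
grp <- rep(seq_len(n_blocks), each = block_size)  # block label of each variable

# Every 20 consecutive variables share one latent factor plus noise.
latent <- matrix(rnorm(n_obs * n_blocks), nrow = n_obs)
x <- latent[, grp] + matrix(rnorm(n_obs * p, sd = 0.4), nrow = n_obs)
rho <- cor(x)

# Mean correlation inside each 20 x 20 tile:
# sum over row groups, then over column groups, then divide by the tile size.
tile_mean <- t(rowsum(t(rowsum(rho, grp)), grp)) / block_size^2

# Diverging palette: blue for negative, white near zero, red for positive.
pal <- colorRampPalette(c("blue", "white", "red"))(100)

# Full matrix; useRaster = TRUE keeps a million cells quick to draw.
image(seq_len(p), seq_len(p), rho, zlim = c(-1, 1), col = pal,
      useRaster = TRUE, xlab = "variable", ylab = "variable")

# Tile averages, one cell per 20 x 20 patch.
image(seq_len(n_blocks), seq_len(n_blocks), tile_mean, zlim = c(-1, 1),
      col = pal, xlab = "block", ylab = "block")
```

Fixing zlim at c(-1, 1) keeps white at zero, so red and blue always mean positive and negative. Note that image draws row 1 of the matrix at the left edge and column 1 at the bottom, so the picture is rotated relative to the printed matrix.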

Upvotes: 2

Behacad

Reputation: 99

I've done this in Excel fairly easily. You can change the colour of cells based on the range of values they contain. You can even create a gradient from, let's say, 0 to 1. 1000 x 1000 would be big for Excel, but I think it would work. You would just have to zoom out.

Upvotes: -2
