nickolakis

Reputation: 621

Make the content of scatterplot matrix boxes clearer to interpret

I want to check the linear relationship between all pairs of variables in a dataset. Because I have 39 variables, the full scatterplot matrix is not very helpful, so I decided to select a random sample of 20 variables, but the chart is still too big to interpret. I'm using the following code:

require("pairsD3")
sample_data <- data[ ,sample(ncol(data), 20)]

pairs(sample_data, pch=19)

and get the following result:

[image: scatterplot matrix of the sampled variables]

Is there any way to make the points smaller so I can see whether linearity exists, or is there another way to check linearity? Thank you in advance!

Upvotes: 2

Views: 64

Answers (2)

jay.sf

Reputation: 72893

Just use the cex= option of pairs() to shrink the points.

# pairs() is part of base R graphics, so no extra package is needed
sample_data <- volcano[ ,sample(ncol(volcano), 5)]

pairs(sample_data, pch=19, main="normal")
pairs(sample_data, pch=19, cex=.1, main="adjusted")

enter image description here

enter image description here
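If shrinking the points alone still leaves dense blobs, semi-transparent points and a smoother in each panel may help. The following is only a sketch using the same built-in volcano data; the seed, the alpha value of 0.3, and the name point_col are arbitrary choices for illustration, not something from the question.

```r
# Sketch: smaller, semi-transparent points reduce overplotting in each panel.
set.seed(1)  # arbitrary seed so the column sample is reproducible
sample_data <- volcano[, sample(ncol(volcano), 5)]

# adjustcolor() from grDevices adds an alpha channel to a named colour
point_col <- adjustcolor("black", alpha.f = 0.3)

# gap shrinks the space between panels, leaving more room for the data
pairs(sample_data, pch = 16, cex = 0.3, col = point_col,
      gap = 0.2, main = "small, semi-transparent points")

# panel.smooth() overlays a lowess curve; a roughly straight curve
# suggests an approximately linear relationship in that panel
pairs(sample_data, panel = panel.smooth, pch = 16, cex = 0.3,
      col = point_col, gap = 0.2, main = "with lowess smoother")
```

With many variables the smoother is often easier to read than the raw points, since the curve stays visible even when the points are tiny.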

Upvotes: 1

G5W

Reputation: 37641

You might do better to try to get at linear relationships directly. That is what the correlation coefficient is for. There are good tools for visualizing the correlation matrix so that you can quickly scan for the relationships. I like corrplot. Since you do not provide any data, I will illustrate with the Glass data.

library(corrplot)
library(mlbench)    ## for Glass data
data(Glass)

corrplot(cor(Glass[,1:9]))

corrplot

This only has 9 variables, but even at 39, you should find this readable. You can look at this and see right away that the strongest relationship is between RI and Ca. There is a pretty strong negative relationship between RI and Si. Once you know which ones are correlated, you can make the scatterplots for only the relevant variables and have more room to see the results.

plot(Glass[ ,c(1,5,7)], pch=16)

Scatterplot of correlated variables
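With 39 variables, scanning the plot by eye can be supplemented by listing the strongest pairs directly from the correlation matrix. This is a rough sketch, not part of corrplot: it uses mtcars as a stand-in dataset because it ships with base R, and the names num_data, pairs_tbl, strong_pairs and the 0.8 cutoff are made up for illustration.

```r
# Stand-in data: mtcars ships with base R (all columns are numeric).
num_data <- mtcars

cor_mat <- cor(num_data)  # Pearson correlation matrix

# Keep each pair once: positions in the upper triangle, diagonal excluded.
idx <- which(upper.tri(cor_mat), arr.ind = TRUE)

pairs_tbl <- data.frame(
  var1 = rownames(cor_mat)[idx[, "row"]],
  var2 = colnames(cor_mat)[idx[, "col"]],
  r    = cor_mat[idx]
)

# Sort from strongest to weakest linear association, ignoring sign.
pairs_tbl <- pairs_tbl[order(-abs(pairs_tbl$r)), ]

# Hypothetical cutoff; pick whatever threshold suits your data.
strong_pairs <- pairs_tbl[abs(pairs_tbl$r) >= 0.8, ]
print(strong_pairs, row.names = FALSE)
```

The variables that show up in strong_pairs are the ones worth passing to plot() or pairs() for a closer look. Keep in mind that the correlation coefficient only measures linear association, so a low value does not rule out a nonlinear relationship.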

Upvotes: 1
