Reputation: 621
I want to check the linear relationship between all pairs of variables in a dataset. Because I have 39 variables, the scatterplot matrix is not very helpful, so I decided to select a random sample of 20 variables to check, but the chart is still too big to interpret. I'm using the following code:
require("pairsD3")
sample_data <- data[ ,sample(ncol(data), 20)]
pairs(sample_data, pch=19)
and get the following result: [image]
Is there any way to make the points smaller so I can see whether linearity exists, or is there another way to check linearity? Thank you in advance!
Upvotes: 2
Views: 64
Reputation: 72893
Just use the cex= option.
require("pairsD3")
sample_data <- volcano[ ,sample(ncol(volcano), 5)]
pairs(sample_data, pch=19, main="normal")
pairs(sample_data, pch=19, cex=.1, main="adjusted")
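If smaller points alone still leave the panels hard to read, one possible variation is to make the points semi-transparent and draw a smoother in each panel, so a roughly straight smoother hints at a linear relationship. This is only a sketch using base R: the fixed seed, the alpha value of 0.3, and the plot title are my own choices, not part of the answer above.

```r
# Base R only; volcano is the same built-in matrix used above.
set.seed(1)  # assumption: fixed seed so the sampled columns are reproducible
sample_data <- volcano[, sample(ncol(volcano), 5)]

# Small, semi-transparent points plus a lowess smoother in every panel.
# panel.smooth() ships with the graphics package.
pairs(sample_data,
      pch = 19, cex = 0.1,
      col = adjustcolor("black", alpha.f = 0.3),  # alpha chosen arbitrarily
      panel = panel.smooth,
      main = "small points with smoother")
```

With 20 variables the panels are still small, so this helps most once the set of variables has been narrowed down.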
Upvotes: 1
Reputation: 37641
You might do better to try to get at linear relationships directly. That is what the correlation coefficient is for. There are good tools for visualizing the correlation matrix so that you can quickly scan for the relationships. I like corrplot. Since you do not provide any data, I will illustrate with the Glass data.
library(corrplot)
library(mlbench) ## for Glass data
data(Glass)
corrplot(cor(Glass[,1:9]))
This only has 9 variables, but even at 39, you should find this readable. You can look at this and see right away that the strongest relationship is between RI and Ca. There is a pretty strong negative relationship between RI and Si. Once you know which ones are correlated, you can make the scatterplots for only the relevant variables and have more room to see the results.
plot(Glass[ ,c(1,5,7)], pch=16)
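With 39 variables there are 741 pairs, so it may also help to rank the pairs by absolute correlation instead of reading them off the plot. The following is a minimal sketch in base R. It uses the built-in mtcars data purely as a stand-in (an assumption, since no data was provided), and the names dat, cm, and pairs_df are made up for illustration.

```r
# mtcars is only a stand-in; substitute your own numeric data frame.
dat <- mtcars

# Pearson correlation matrix; pairwise deletion in case of missing values
cm <- cor(dat, use = "pairwise.complete.obs")

# Keep each pair once by taking the upper triangle only
idx <- which(upper.tri(cm), arr.ind = TRUE)
pairs_df <- data.frame(
  var1 = rownames(cm)[idx[, "row"]],
  var2 = colnames(cm)[idx[, "col"]],
  r    = cm[idx]
)

# Sort by absolute correlation, strongest first
pairs_df <- pairs_df[order(-abs(pairs_df$r)), ]
head(pairs_df, 5)
```

The top rows of pairs_df are then candidates for individual scatterplots. Keep in mind that the Pearson coefficient only measures linear association, so a low value does not rule out a nonlinear relationship.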
Upvotes: 1