user7307305

removing specific columns in R

I am using the findCorrelation function (from the caret package) in R:

highCorr <- findCorrelation(correlations, cutoff = .60, names = FALSE)

The function returns the numbers (or names) of the columns that are correlated at 0.6 or above.

I want to remove these columns.

I don't know how to do this. If I remove the columns one at a time, the column numbers change after each removal. I also want to try a few cutoff thresholds, so I would like to do this automatically.

Upvotes: 1

Views: 738

Answers (1)

Patrick Williams

Reputation: 704

If you still have the original data that the correlation matrix was computed from, you can do the following:

library(caret) #findCorrelation comes from this library
set.seed(1)

#create simulated data for correlation matrix
mydata <- matrix(data = rnorm(100,mean = 100, sd = 3), nrow = 10, ncol = 10)

#create correlation matrix
correlations <- cor(mydata)

#index correlations at cutoff
corr_ind <- findCorrelation(correlations, cutoff = .2)

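#remove columns from original data based on index value
#(a matrix needs the comma to index columns; mydata[-corr_ind] would drop single cells instead)
#(if nothing is flagged, keep everything: mydata[, -integer(0)] would select no columns)
remove_corrs <- if (length(corr_ind) > 0) mydata[, -corr_ind, drop = FALSE] else mydata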
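
To try several cutoffs automatically, the same steps could be wrapped in a small function and applied to each threshold. This is only a sketch: drop_high_corr, cutoffs, and reduced are names made up for illustration, and the cutoff values are arbitrary. Because all flagged columns are removed in one indexing step, the shifting column numbers are not an issue.

```r
library(caret)  # provides findCorrelation()

# Hypothetical helper (not part of caret): drop the columns that
# findCorrelation() flags at a given cutoff.
drop_high_corr <- function(data, cutoff) {
  idx <- findCorrelation(cor(data), cutoff = cutoff)
  # Guard against an empty index: data[, -integer(0)] would select no columns.
  if (length(idx) == 0) return(data)
  # Negative indexing removes all flagged columns at once.
  data[, -idx, drop = FALSE]
}

set.seed(1)
mydata <- matrix(rnorm(100, mean = 100, sd = 3), nrow = 10, ncol = 10)

# Arbitrary example thresholds; one reduced matrix is produced per cutoff.
cutoffs <- c(0.2, 0.4, 0.6)
reduced <- lapply(cutoffs, function(k) drop_high_corr(mydata, k))
names(reduced) <- paste0("cutoff_", cutoffs)

# Number of columns kept at each cutoff
sapply(reduced, ncol)
```

Each element of reduced is the original data with the flagged columns removed for that cutoff, so the versions can be compared side by side.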

Upvotes: 1
