Omar

Reputation: 57

Correlation matrix produces 1s in diagonal and NA for the rest

I have a data frame (Compiled_data) with 7 columns of numeric data. I want to find the correlations between these columns using the cor() function, but the matrix it returns has 1s on the diagonal and NA in every other position.

Column_headers <- c("Country", "Country_code", "Year", "Death.rate",
                    "Fertility.rate", "Greenhouse.gas", "Mobile.subs",
                    "Permanent_cropland", "Population.density",
                    "Birth.rate")

I want to explore the relationships between the data in columns "Death.rate" through "Birth.rate":

Death.rate <- c(19.262,19.321,19.120,18.652)
Fertility.rate <- c(6.942,6.928,6.904,6.869)
Greenhouse.gas <- c(107540.6,109807.3,111165.3,110459.4)
Mobile.subs <- c(NA,4,0,0,0)
Permanent.cropland <- c(1.982024,1.982024,1.982024,1.982024)
Population.density <- c(503.4312,511.8361,519.6092,528.0958)
Birth.rate <- c(46.879,46.511,46.117,45.704)

I would also like to exclude NAs and 0s from being considered in the calculation. Any help would be great!

Upvotes: 0

Views: 308

Answers (2)

Omar

Reputation: 57

Thanks everyone for the feedback. The following code worked for this:

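# Columns 4 to 10 hold the numeric variables (Death.rate through Birth.rate)
cordata <- Compiled_dataset[, 4:10]
# "pairwise.complete.obs" (which "pairwise" abbreviates) skips NAs pair by pair
corr <- cor(cordata, use = "pairwise.complete.obs", method = "spearman")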
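
One caveat, assuming the vectors in the question reflect the real data: Permanent.cropland is constant, so its standard deviation is zero and cor() still returns NA for that row and column (with a warning), whatever the "use" setting. A minimal sketch of dropping such columns first is below. The small data frame is rebuilt from three of the question's vectors, and the helper name `varies` is hypothetical.

```r
# Values copied from the question. Mobile.subs is left out because it is
# listed there with five values while the other vectors have four.
cordata <- data.frame(
  Death.rate         = c(19.262, 19.321, 19.120, 18.652),
  Permanent.cropland = c(1.982024, 1.982024, 1.982024, 1.982024),
  Birth.rate         = c(46.879, 46.511, 46.117, 45.704)
)

# TRUE for columns with more than one distinct non-missing value
# (`varies` is a hypothetical name used only for this illustration).
varies <- vapply(
  cordata,
  function(x) length(unique(x[!is.na(x)])) > 1,
  logical(1)
)

# Correlate only the columns that actually vary.
corr <- cor(cordata[, varies, drop = FALSE],
            use = "pairwise.complete.obs", method = "spearman")
corr
```

Whether a constant column should be dropped or simply reported as NA depends on the analysis; this only shows one way to avoid the NA entries.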

Upvotes: 0

Carey Caginalp

Reputation: 432

As Ronak mentioned, you probably have NA values in the data, which interfere with the correlation calculation: by default, cor() returns NA for any pair of columns in which either column contains an NA. You will need to set the "use" argument of cor(), e.g. use = "pairwise.complete.obs", so that each pair of columns is compared using only the observations where both have data. If you want to exclude 0s as well, you can convert them to NA before running cor().
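
A minimal sketch of both steps, using a made-up toy data frame (the column names `a`, `b`, `c` and their values are illustrative assumptions, not the asker's dataset):

```r
# Hypothetical toy data: one NA in `a`, one 0 in `b`.
df <- data.frame(
  a = c(1, 2, NA, 4, 5),
  b = c(2, 0, 6, 8, 10),
  c = c(5, 3, 4, 1, 2)
)

# With the default use = "everything", off-diagonal entries involving
# a column that contains NA come out as NA.
cor(df)

# Treat 0 as missing. The !is.na() guard keeps existing NAs out of the index.
df[!is.na(df) & df == 0] <- NA

# For each pair of columns, use only the rows where both values are present.
corr <- cor(df, use = "pairwise.complete.obs")
corr
```

Note that with pairwise deletion each entry of the matrix may be based on a different subset of rows, so the entries are not always directly comparable.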

Upvotes: 1
