Reputation: 1
I am trying to do a PCA of monthly temperatures, but I am given a dataset that has more columns than just the monthly data. How do I only read in the month columns to perform the PCA? Here is everything I have so far:
dat_TEMP=read.table("TEMPERATURE.csv",header=TRUE, sep=";", dec=",",row.names=1)
attach(dat_TEMP)
df=data.frame(January,February,March,April,May,June,July,August,September,October,November,December)
dat.pca=prcomp(df,dat_TEMP,center=T,scale=T)
but when I try to run that last line it gives me this error: "Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric"
Can anyone help me with this? What do I need to do to just read out the month columns?
Upvotes: 0
Views: 1338
Reputation: 544
You need to make sure that, when the data is read in, your numeric columns aren't stored as character or factor. Once that is the case, you can subset the data to the numeric columns and then run the PCA.
There are multiple ways to subset the data to only its numeric columns.
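As a minimal sketch of that check, the snippet below builds a small made-up data frame (the values and the single `January` column are illustrative, not taken from your file) in which a decimal comma has caused the column to be read as character, and then converts it:

```r
# Hypothetical column as it might look if decimal commas were not parsed on import.
dat <- data.frame(January = c("1,5", "2,0", "-0,3"), stringsAsFactors = FALSE)

# Inspect the class of every column; January is reported as character here.
sapply(dat, class)

# Replace the decimal comma with a point, then convert to numeric.
# For a factor column, wrap it in as.character() first.
dat$January <- as.numeric(sub(",", ".", dat$January, fixed = TRUE))

is.numeric(dat$January)
```

If every month column already shows up as numeric or integer in `sapply(dat_TEMP, class)`, no conversion is needed.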
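With dplyr:
library("dplyr")
data.numeric <- select_if(data, is.numeric)
With base R:
colnums <- sapply(data, is.numeric)
data[, colnums]
Alternatively (note that this keeps only columns of class "numeric", so integer columns are dropped):
data[, sapply(data, class) == "numeric"]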
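Applied to the question, a rough sketch could look like the following. It uses a made-up stand-in for `dat_TEMP` (random values and a hypothetical `Region` column) because the real file isn't available, and it assumes the month columns are named exactly as in R's built-in `month.name` vector. Note that `prcomp()` should receive only the month data: in the original call the extra `dat_TEMP` argument is probably unintended, and the scaling argument is spelled `scale.`.

```r
# Toy stand-in for dat_TEMP: 12 month columns plus one non-numeric extra column.
set.seed(1)
dat_TEMP <- as.data.frame(matrix(rnorm(20 * 12, mean = 10, sd = 5), nrow = 20))
names(dat_TEMP) <- month.name                            # "January" ... "December"
dat_TEMP$Region <- rep(c("North", "South"), each = 10)   # hypothetical extra column

# Keep only the month columns, selected by name (no attach() needed).
df <- dat_TEMP[, month.name]

# Confirm that every selected column is numeric before running the PCA.
stopifnot(all(sapply(df, is.numeric)))

# Run the PCA on the month columns only.
dat.pca <- prcomp(df, center = TRUE, scale. = TRUE)
summary(dat.pca)
```

If the `stopifnot()` check fails on your real data, at least one month column was read as character or factor and needs converting first.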
Upvotes: 2