Reputation: 3275
I am trying to run the code from this question:
Change the class from factor to numeric of many columns in a data frame
on a data frame with 140 columns:
cols = c(1:140);
merged_dataset[,cols] = apply(merged_dataset[,cols], 2, function(x) as.numeric(as.character(x)));
The problem is that some columns end up as NAs. Is there a way to exclude those columns from the conversion so that their data is kept instead of being turned into NAs? These columns are of type character, if that helps.
Upvotes: 0
Views: 679
Reputation: 520908
If you already know the indices of the columns you want to skip, you can subset the data frame so that only the remaining columns are converted:
cols <- c(1:140) # all columns
cols.skip <- c(1,3,5,21) # columns which CAN'T be converted to numeric
cols.keep <- cols[!cols %in% cols.skip]
merged_dataset[,cols.keep] <- apply(merged_dataset[,cols.keep], 2, function(x) {
    as.numeric(as.character(x))
})
To implement similar logic using column names rather than indices:
cols.skip <- c("a", "b", "c")
cols.keep <- !(names(merged_dataset) %in% cols.skip)
merged_dataset[,cols.keep] <- apply(merged_dataset[,cols.keep], 2, function(x) {
    as.numeric(as.character(x))
})
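If the columns to skip are not known in advance, one option is to detect them instead of listing them by hand. The following is only a sketch under assumptions: the toy `merged_dataset` and the helper `is_convertible` are made up for illustration, and a column counts as convertible only when every non-missing value parses as a number.

```r
# Toy data standing in for merged_dataset (hypothetical values)
merged_dataset <- data.frame(
  a = factor(c("1", "2", "3")),  # factor holding numbers
  b = c("x", "y", "z"),          # genuine text, must be left alone
  c = c("1.5", NA, "2.5"),       # numbers stored as text, with a real NA
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when converting the column creates no new NAs
is_convertible <- function(x) {
  x <- as.character(x)
  parsed <- suppressWarnings(as.numeric(x))
  all(is.na(x) | !is.na(parsed))
}

# Logical vector with one entry per column
cols.keep <- vapply(merged_dataset, is_convertible, logical(1))

# Convert only the columns that passed the check
merged_dataset[cols.keep] <- lapply(merged_dataset[cols.keep], function(x) {
  as.numeric(as.character(x))
})

str(merged_dataset)
```

Using `lapply` on the selected columns avoids the intermediate matrix that `apply` builds, but either form should work for this purpose.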
Upvotes: 1
Reputation: 5788
You can also trim stray whitespace from the factor levels before converting, so that the numbers parse cleanly. This function converts only factor columns and returns every other column unchanged:
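convert_factors_to_numeric <- function(df) {
  as.data.frame(
    lapply(df, function(x) {
      if (is.factor(x)) {
        # Trim whitespace from the labels, then parse them as numbers
        as.numeric(trimws(as.character(x), which = "both"))
      } else {
        x  # leave non-factor columns unchanged
      }
    }),
    stringsAsFactors = FALSE
  )
}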
df_converted <- convert_factors_to_numeric(df)
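As a small illustration of why the conversion goes through `as.character()` (and here `trimws()`) first: calling `as.numeric()` directly on a factor returns its internal level codes rather than the numbers in the labels. The factor `f` below is a made-up example.

```r
# Hypothetical factor column whose labels carry stray whitespace
f <- factor(c(" 10", "20 ", " 30 "))

# Direct conversion yields the internal level codes, not the values
codes <- as.numeric(f)

# Going through character labels (trimmed) yields the actual numbers
values <- as.numeric(trimws(as.character(f), which = "both"))

print(codes)
print(values)
```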
Upvotes: 0