Reputation: 227
I have over 500 factor columns in my dataframe, many of which contain only "True"/"False". Is there any way to remove the quotes for just these columns in one shot?
Example code --
sample=as.list(dataframe[1,])
for(i in 1:length(sample)){
if(sample[i]=="false") sample[i]=false
}
The above code doesn't seem to work. Any leads appreciated!
Upvotes: 0
Views: 257
Reputation: 6921
This solves your problem:
> as.logical(c("true", "false", "True", "TRUE", "False"))
[1] TRUE FALSE TRUE TRUE FALSE
I was surprised too.
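For reference, here is a small sketch of which spellings as.logical() accepts. The vector name `spellings` is only for illustration. Anything outside the recognized set comes back as NA rather than raising an error.

```r
# Spellings that as.logical() recognizes, plus two that it does not.
spellings <- c("T", "TRUE", "true", "True",
               "F", "FALSE", "false", "False",
               "tRuE", "yes")
converted <- as.logical(spellings)

# Pair each input with its result; the last two entries come back as NA.
print(data.frame(input = spellings, result = converted))
```

So mixed capitalization such as "tRuE" is worth checking for before relying on the conversion.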
EDIT: I just noticed your code and I figured you could use a complete example.
Your data is in a data.frame (which is basically a list of columns). This is similar to a spreadsheet if you will.
Doing dataframe[1,] extracts the first row of your dataset. I guess what you want is rather to get the first column with dataframe[,1]. This column is a vector, which is good to operate on, so there is no need to put it in a list.
So you would do:
as.logical(dataframe[,1])
But that would only return the data you want, not modify the dataframe! So you want to assign this result to the first column:
dataframe[,1] <- as.logical(dataframe[,1])
There you go, the first column no longer contains strings but logicals, for any of the capitalizations that as.logical recognizes (TRUE, True, true and the FALSE equivalents).
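A minimal end-to-end sketch of that assignment, using made-up data (the column names `flag` and `group` are hypothetical). It assumes the column is a factor, as in the question; as.logical() works on a factor's labels, not its integer codes.

```r
# Hypothetical data: one true/false factor column and one ordinary factor column.
dataframe <- data.frame(
  flag  = factor(c("True", "False", "True")),
  group = factor(c("a", "b", "a"))
)

# Convert only the first column and write the result back into the data.frame.
dataframe[, 1] <- as.logical(dataframe[, 1])

# Inspect the column types: flag is now logical, group is still a factor.
str(dataframe)
```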
If by any chance you actually meant to work on the row, this is unusual and likely means that you should transpose your data.frame, i.e. swap rows and columns. This is done with t().
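A rough sketch of the transposing route, with invented data (`wide`, `long`, `flag` and `value` are hypothetical names). One thing to be aware of: t() returns a matrix, so the result has to be wrapped in as.data.frame() and the columns converted back to their proper types afterwards.

```r
# Hypothetical layout with one variable per row and one observation per column.
wide <- data.frame(obs1 = c("True", "1.5"),
                   obs2 = c("False", "2.5"),
                   row.names = c("flag", "value"))

# t() gives a character matrix; turn it back into a data.frame.
long <- as.data.frame(t(wide), stringsAsFactors = FALSE)

# Every column is character after the transpose, so restore the types.
long$flag  <- as.logical(long$flag)
long$value <- as.numeric(long$value)

str(long)
```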
Upvotes: 0
Reputation: 2226
I think this is what you want, assuming that the columns you are talking about have two levels, "FALSE" and "TRUE".
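# stringsAsFactors = TRUE is needed from R 4.0.0 on, where strings are no longer converted to factors by default
df = data.frame(a=c("\"true\"","\"false\""), b=c("\"FALSE\"","\"TRUE\""), c=c("TRUE","FALSE"), stringsAsFactors = TRUE)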
df
# a b c
# 1 "true" "FALSE" TRUE
# 2 "false" "TRUE" FALSE
ftlev = c("\"FALSE\"", "\"TRUE\"")
df2 = lapply(df, FUN = function(x) {
  if (identical(ftlev,toupper(levels(x)))) {
    x = gsub('"','',x)
  }
  return(x)
})
as.data.frame(df2)
Output:
a b c
1 true FALSE TRUE
2 false TRUE FALSE
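The as.logical() function has been proposed in other answers/comments, but it does not produce the expected output here, because the values contain literal quote characters and as.logical() turns anything it does not recognize into NA: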
df2 = lapply(df, FUN = function(x) {
  if (identical(ftlev,toupper(levels(x)))) {
    x = as.logical(x)
  }
  return(x)
})
as.data.frame(df2)
Output:
a b c
1 NA NA TRUE
2 NA NA FALSE
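If actual logical values are wanted rather than unquoted strings, the two steps can be chained: strip the quotes first, then convert. The following is only a sketch under that assumption; `unquote_logical` is a hypothetical helper name, not something from base R.

```r
df <- data.frame(a = c("\"true\"", "\"false\""),
                 b = c("\"FALSE\"", "\"TRUE\""),
                 c = c("TRUE", "FALSE"),
                 stringsAsFactors = TRUE)

# Hypothetical helper: remove literal quotes, then convert to logical
# only if every non-missing value is a spelling as.logical() recognizes.
unquote_logical <- function(x) {
  if (!is.factor(x)) return(x)
  cleaned <- gsub('"', "", as.character(x), fixed = TRUE)
  as_lgl <- as.logical(cleaned)
  # A new NA means some value was not a true/false spelling: leave the column alone.
  if (any(is.na(as_lgl) & !is.na(cleaned))) return(x)
  as_lgl
}

# Assigning into df[] keeps df a data.frame.
df[] <- lapply(df, unquote_logical)
str(df)
```

Columns whose values are not all true/false spellings are returned untouched, so this should be safe to run over a data frame that mixes both kinds of factor columns.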
Upvotes: 0
Reputation: 146020
If you give a better example (with some columns to convert, some columns not to convert), I'm happy to test. From your description, I think this will work:
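# Assigning into data[] keeps the result a data.frame; lapply() on its own returns a plain list
data[] <- lapply(data, FUN = function(x) {
  # && returns a single TRUE/FALSE, which is what if() expects
  if (is.factor(x) && all(toupper(levels(x)) %in% c("TRUE", "FALSE"))) {
    return(as.logical(x))
  }
  return(x)
})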
It tests if the column is a factor and if its levels can be coerced to TRUE and FALSE, converts it to logical if yes, returns the column unchanged if no.
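Since no example data was given, here is a sketch on an invented data frame with some columns that should be converted and some that should not. The column names and the `is_tf_factor` helper are hypothetical.

```r
# Hypothetical data: two true/false factors, one ordinary factor, one numeric column.
data <- data.frame(
  flag_a = factor(c("True", "False", "True")),
  flag_b = factor(c("FALSE", "FALSE", "TRUE")),
  colour = factor(c("red", "green", "red")),
  score  = c(1.5, 2.0, 3.5)
)

# TRUE only for factor columns whose levels are all some casing of TRUE/FALSE.
is_tf_factor <- function(x) {
  is.factor(x) && all(toupper(levels(x)) %in% c("TRUE", "FALSE"))
}

# Assigning into data[] keeps `data` a data.frame instead of a plain list.
data[] <- lapply(data, function(x) if (is_tf_factor(x)) as.logical(x) else x)

# Check the resulting column classes.
sapply(data, class)
```

Only flag_a and flag_b change type; colour stays a factor and score stays numeric.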
Upvotes: 2