Reputation: 3126
I'm trying to strip trailing spaces from the factor variables in a data frame. However, the levels assignment doesn't work inside my lapply function.
rm.space<-function(x){
a<-gsub(" ","",x)
return(a)}
lapply(names(barn),function(x){
levels(barn[,x])<-rm.space(levels(barn[,x]))
})
Any ideas how I can assign levels inside a lapply function?
//M
Upvotes: 1
Views: 2704
Reputation: 50704
As Joris states, lapply works on a local copy of the data.frame, so it won't modify your original data. But you can assign its result back to replace your data:
barn[] <- lapply(barn, function(x) {
levels(x) <- rm.space(levels(x))
x
})
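To make the difference concrete, here is a minimal sketch. The toy `barn` data frame below is made up for illustration (the real one is not shown in the question). It shows that the assignment inside the anonymous function only touches a local copy, while assigning the result to `barn[]` replaces the columns and keeps the data.frame structure.

```r
# Hypothetical toy data; the asker's real `barn` is not shown.
barn <- data.frame(
  sex  = c("boy ", "girl ", "boy "),
  town = c("Lund ", "Umea ", "Lund "),
  stringsAsFactors = TRUE
)

rm.space <- function(x) gsub(" ", "", x)

# The function modifies its own copy of `barn`; the original is untouched.
invisible(lapply(names(barn), function(x) {
  levels(barn[, x]) <- rm.space(levels(barn[, x]))
}))
before <- levels(barn$sex)  # levels still carry the trailing spaces

# Assigning into barn[] replaces every column but keeps the
# data.frame class and the column names.
barn[] <- lapply(barn, function(x) {
  levels(x) <- rm.space(levels(x))
  x
})
after <- levels(barn$sex)   # spaces are gone now
```

Writing `barn <- lapply(...)` without the `[]` would turn `barn` into a plain list, which is why the empty brackets matter here.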
This is useful when your data has columns of different types and you want to modify only the factors, e.g.:
factors <- sapply(barn, is.factor)
barn[factors] <- lapply(barn[factors], function(x) {
levels(x) <- rm.space(levels(x))
x
})
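Note that `rm.space` from the question removes every space, not only trailing ones. If only trailing spaces should go, something like the following might work. This is a sketch: `rm.trailing` is a hypothetical helper name and the data is invented for illustration.

```r
# Hypothetical helper: strips spaces only at the end of each string.
rm.trailing <- function(x) sub(" +$", "", x)

# Hypothetical mixed-type data; not the asker's real `barn`.
barn <- data.frame(
  name = c("Anna Lisa ", "Karl ", "Anna Lisa "),
  age  = c(7, 9, 7),
  stringsAsFactors = TRUE
)

# vapply() always returns a logical vector, even for a zero-column data frame.
factors <- vapply(barn, is.factor, logical(1))
barn[factors] <- lapply(barn[factors], function(x) {
  levels(x) <- rm.trailing(levels(x))
  x
})

levels(barn$name)  # the inner space in "Anna Lisa" is kept
```

The numeric column `age` is skipped entirely, so it is never passed to `levels<-`.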
Upvotes: 0
Reputation: 108543
From your code I read that the lapply is used to loop over different variables, not over the levels of the factor. So you do need some kind of looping structure, but lapply is a bad choice.
Anyway, if you need to assign something to a variable in your global environment from within a lapply, you need the <<- operator. Say you have selected a number of variables from which the spaces have to be removed:
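f <- paste("", letters[1:5])
Df <- data.frame(
X1 = sample(f, 10, replace = TRUE),
X2 = sample(f, 10, replace = TRUE),
X3 = sample(f, 10, replace = TRUE),
stringsAsFactors = TRUE  # needed in R >= 4.0 to get factor columns
)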
# Bad example :
lapply(c("X1","X3"),function(x){
levels(Df[,x])<<-gsub(" +","",levels(Df[,x]))
})
gives
> str(Df)
'data.frame': 10 obs. of 3 variables:
$ X1: Factor w/ 3 levels "a","b","c": 2 3 1 1 1 2 3 2 2 2
$ X2: Factor w/ 5 levels " a"," b"," c",..: 4 5 4 2 5 5 1 2 5 3
$ X3: Factor w/ 5 levels "a","b","c","d",..: 2 3 4 1 4 1 3 3 5 4
It is better to use a for loop:
for( i in c("X1","X3")){
levels(Df[,i])<-gsub(" +","",levels(Df[,i]))
}
This does what you need without the hassle of the <<- operator and without holding memory unnecessarily.
Upvotes: 1
Reputation: 368251
R is vectorised, so you do not need apply():
> f <- as.factor(sample(c(" a", " b", "c", " d"), 10, replace=TRUE))
> levels(f)
[1] " a" " b" "c" " d"
> levels(f) <- gsub(" +", "", levels(f), perl=TRUE)
> levels(f)
[1] "a" "b" "c" "d"
> f
[1] d a c b c d d a a a
Levels: a b c d
Upvotes: 6