Reputation: 41
I am trying to remove the columns of a data frame df that contain 0. Below is my syntax:
df_new<-df[,which(colSums(df) !=0)]
I am getting the following error:
Error in colSums(df) : 'x' must be numeric.
What am I doing wrong?
Upvotes: 0
Views: 57
Reputation: 516
This should work:
df[,sapply(df,function(V) sum(V==0)==0)]
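As a minimal sketch of the line above, assuming a small made-up data frame (the column names a, b and c are hypothetical) with no NA values. The error in the question comes from colSums(), which only accepts numeric input, whereas the sapply() version tests each column on its own. The drop = FALSE argument is an addition here, so that the result stays a data frame even if only one column survives.

```r
# Hypothetical example data: 'b' contains a zero, 'c' is a character column
df <- data.frame(a = c(1, 2, 3),
                 b = c(0, 5, 6),
                 c = c("x", "y", "z"),
                 stringsAsFactors = FALSE)

# TRUE for every column that contains no value equal to 0
keep <- sapply(df, function(V) sum(V == 0) == 0)

# drop = FALSE keeps the result a data frame even if a single column remains
df_new <- df[, keep, drop = FALSE]
names(df_new)  # "a" "c"
```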
EDIT
The above code naturally works for all numeric columns. But what about factor columns, or character columns containing "0"? Do they behave the same way? We can run a few tests:
factor(letters[1:5]) == 0
# FALSE FALSE FALSE FALSE FALSE
factor(c(0:5)) == 0
# TRUE FALSE FALSE FALSE FALSE FALSE
as.character(c(0:5)) == 0
# TRUE FALSE FALSE FALSE FALSE FALSE
c(0,letters[1:5]) == 0
# TRUE FALSE FALSE FALSE FALSE FALSE
factor(c(0,letters[1:5])) == 0
# TRUE FALSE FALSE FALSE FALSE FALSE
What happens is that R converts the numeric 0 on the RHS into the character "0", and also converts the factor column on the LHS into character. So the code should generally work fine if you really want to remove any column containing "0", no matter whether it is a number or a character. But if the intention is to always retain character and factor columns, then something like this might help:
df[,sapply(df,function(V) sum(V==0)==0 | is.character(V) | is.factor(V))]
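To illustrate the difference between the two tests, here is a small sketch with made-up data (all column names are hypothetical), again assuming no NA values. The character column s holds the string "0", so the plain test drops it while the extended test retains it.

```r
# Hypothetical data: 'n' is numeric with a zero, 's' is character with "0",
# 'f' is a factor without a zero, 'm' is numeric without a zero
df <- data.frame(n = c(0, 1, 2),
                 s = c("0", "a", "b"),
                 f = factor(c("u", "v", "w")),
                 m = c(4, 5, 6),
                 stringsAsFactors = FALSE)

# Plain test: drops every column holding 0 or "0"
plain <- sapply(df, function(V) sum(V == 0) == 0)

# Extended test: character and factor columns are always retained
extended <- sapply(df, function(V) sum(V == 0) == 0 | is.character(V) | is.factor(V))

names(df)[plain]     # "f" "m"
names(df)[extended]  # "s" "f" "m"
```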
Upvotes: 1