What is a good way to conditionally drop a data frame column based on the sum of its values?
For example, in the following data frame, I want to drop all columns where the sum of the values is zero.
df <- data.frame(Dum1 = c(0,0,0,1,0,0,0,0,0,0),
                 Dum2 = c(0,0,0,0,0,0,0,0,0,0),
                 Dum3 = c(0,0,0,1,0,1,0,0,0,0),
                 Dum4 = c(0,0,0,0,0,0,0,0,0,0))
colSums(as.matrix(df))
Dum1 Dum2 Dum3 Dum4
   1    0    2    0
Dum2 and Dum4 are all zeros, so I would like to drop them. Unfortunately, in my application I will not know in advance which columns sum to zero; if I did, I could drop them by name with something like this:
df$Dum2 <- NULL
df$Dum4 <- NULL
str(df)
'data.frame': 10 obs. of 2 variables:
$ Dum1: num 0 0 0 1 0 0 0 0 0 0
$ Dum3: num 0 0 0 1 0 1 0 0 0 0
Any assistance would be greatly appreciated.
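For reference, here is a minimal sketch of one way this might be done without knowing the column names in advance: build a logical vector from `colSums()` and use it to subset the columns. The names `keep` and `df_nonzero` are just illustrative, and the sketch assumes every column is numeric and contains no `NA` values (with missing values, something like `na.rm = TRUE` would probably be needed).

```r
# Rebuild the example data frame from above
df <- data.frame(Dum1 = c(0,0,0,1,0,0,0,0,0,0),
                 Dum2 = c(0,0,0,0,0,0,0,0,0,0),
                 Dum3 = c(0,0,0,1,0,1,0,0,0,0),
                 Dum4 = c(0,0,0,0,0,0,0,0,0,0))

# TRUE for every column whose sum is non-zero (assumes numeric columns, no NAs)
keep <- colSums(df) != 0

# drop = FALSE keeps the result a data frame even if only one column survives
df_nonzero <- df[, keep, drop = FALSE]

names(df_nonzero)
# [1] "Dum1" "Dum3"
```

Note that this tests the sum, as asked, so a column containing values that cancel out (for example 1 and -1) would also be dropped. If the real goal is to drop columns that are entirely zero, a test such as `colSums(df != 0) > 0` may be closer to the intent.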