Reputation: 459
I'm trying to identify the columns of a data frame that contain any entry equal to 0. I would like to create a new data frame without those columns, and also get a list of the columns retained and the columns removed.
Example:
dataframe1:
Column1 Column2 Column3 Column4
.03 .05 .07 .08
.01 .09 .22 .39
0 .56 .88 .56
dataframe2:
Column2 Column3 Column4
.05 .07 .08
.09 .22 .39
.56 .88 .56
retainedColumns = 2, 3, 4
removedColumns = 1
I figured this could be done easily in dplyr. Here is my attempt at creating the new data frame (it keeps crashing):
dataframe2<-dataframe1[!dataframe1 %in% 0, ]
Any help would be appreciated.
Upvotes: 0
Views: 5149
Reputation: 16277
You could do the following. colSums(df==0) counts the zeroes in each column. The ! in front converts those counts to logicals and negates them, so the result is TRUE only for columns whose count is 0. Indexing df with that vector keeps just the columns without any zeroes.
df[!colSums(df==0)]
Column2 Column3 Column4
1 0.05 0.07 0.08
2 0.09 0.22 0.39
3 0.56 0.88 0.56
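To see why the one-liner works, it can be unpacked into separate steps. This is only a sketch, using the same df as in the DATA section of this answer; zero_counts and keep are illustrative names, and dataframe2 is the name from the question.

```r
# Same sample data as in the DATA section of this answer.
df <- read.table(text = "Column1 Column2 Column3 Column4
.03 .05 .07 .08
0 .09 .22 .39
0 .56 .88 .56", header = TRUE, stringsAsFactors = FALSE)

# df == 0 is a logical matrix; summing each column counts its zeroes
# (TRUE counts as 1).
zero_counts <- colSums(df == 0)

# A column is kept only if it contains no zeroes.
keep <- zero_counts == 0

# drop = FALSE keeps the result a data frame even if only one column survives.
dataframe2 <- df[, keep, drop = FALSE]
```

Writing zero_counts == 0 instead of !colSums(df==0) gives the same logical vector; it just spells out the comparison rather than relying on the coercion of counts to logicals.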
And here's how to get a list of columns retained and removed:
retainedColumns <- which(!colSums(df==0))
#Column2 Column3 Column4
# 2 3 4
removedColumns <- which(colSums(df==0) > 0)
#Column1
# 1
#A double negation (!!) would also work here:
removedColumns <- which(!!colSums(df==0))
DATA
df <- read.table(text="Column1 Column2 Column3 Column4
.03 .05 .07 .08
0 .09 .22 .39
0 .56 .88 .56", header=TRUE, stringsAsFactors=FALSE)
Upvotes: 4