statsguyz

Reputation: 459

Identifying columns with values equal to 0 in R

I'm trying to identify the columns of a data frame that contain any entries equal to 0. I would then like to create a new data frame without those columns, along with a list of the columns retained and the columns removed.

Example:

    dataframe1:

Column1       Column2     Column3     Column4
    .03           .05         .07         .08
    .01           .09         .22         .39
      0           .56         .88         .56

    dataframe2:

Column2     Column3     Column4
    .05         .07         .08
    .09         .22         .39
    .56         .88         .56

   retainedColumns = 2, 3, 4
   removedColumns = 1

I figured this could be done easily in dplyr. My attempt at creating the new data frame keeps crashing:

dataframe2<-dataframe1[!dataframe1 %in% 0, ] 

Any help would be appreciated.

Upvotes: 0

Views: 5149

Answers (1)

Pierre Lapointe

Reputation: 16277

You could do the following. colSums(df==0) counts the zeros in each column. The ! in front coerces those counts to logical (0 becomes FALSE, any other count becomes TRUE) and negates them, so only the columns whose zero count is 0 are kept.

df[!colSums(df==0)]

  Column2 Column3 Column4
1    0.05    0.07    0.08
2    0.09    0.22    0.39
3    0.56    0.88    0.56
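
If the one-liner feels too terse, here is a rough sketch that spells out the same steps one at a time. The intermediate names (is_zero, zero_counts, keep, df_no_zero) are only for illustration, and na.rm = TRUE and drop = FALSE are additions that the one-liner above does not use. I am assuming you would want missing values ignored and a data frame returned even when a single column survives.

```r
# Same data as in the DATA section of this answer.
df <- read.table(text = "Column1 Column2 Column3 Column4
.03 .05 .07 .08
0 .09 .22 .39
0 .56 .88 .56", header = TRUE, stringsAsFactors = FALSE)

# Step 1: logical matrix, TRUE wherever a cell equals 0.
is_zero <- df == 0

# Step 2: number of zeros per column.
# na.rm = TRUE (an assumption, not in the one-liner) ignores NA cells.
zero_counts <- colSums(is_zero, na.rm = TRUE)

# Step 3: keep only the columns with no zeros.
# drop = FALSE keeps the result a data frame even with one column left.
keep <- zero_counts == 0
df_no_zero <- df[, keep, drop = FALSE]
```

Without na.rm = TRUE, a column containing NA gets a count of NA, and indexing with that NA would likely fail or give a surprising result, so it is worth deciding up front how you want missing values handled.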

And here's how to get a list of columns retained and removed:

retainedColumns <- which(!colSums(df==0)) 
#Column2 Column3 Column4 
#  2       3       4 

removedColumns <- which(colSums(df==0) > 0)
#Column1 
#  1 

#A double negation (!!) would also work here:
removedColumns <- which(!!colSums(df==0))
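
which() gives you named positions. If you would rather have the column names themselves as plain character vectors, something along these lines should work. The names has_zero, retainedNames and removedNames are made up for this sketch.

```r
# Same data as in the DATA section of this answer.
df <- read.table(text = "Column1 Column2 Column3 Column4
.03 .05 .07 .08
0 .09 .22 .39
0 .56 .88 .56", header = TRUE, stringsAsFactors = FALSE)

# TRUE for every column that contains at least one zero.
has_zero <- colSums(df == 0) > 0

# Column names instead of positions.
retainedNames <- names(df)[!has_zero]
removedNames  <- names(df)[has_zero]
```

Here retainedNames holds the names of the zero-free columns and removedNames the rest, so df[retainedNames] gives the same data frame as df[!colSums(df==0)].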

DATA

df <- read.table(text="Column1 Column2 Column3 Column4
.03 .05 .07 .08
0 .09 .22 .39
0 .56 .88 .56", header=TRUE, stringsAsFactors=FALSE)

Upvotes: 4
