Daniel

Reputation: 33

Trying to remove rows based on values in two columns

My data is formatted something like this:

| name | value1 | rem |
|------|--------|-----|
| tom  | 1      | 1   |
| tom  | 3      | 0   |
| tom  | 5      | 0   |
| bill | 7      | 0   |
| bill | 1      | 1   |
| bill | 3      | 0   |
| mark | 5      | 0   |
| mark | 9      | 0   |
| mark | 9      | 0   |

What I'm trying to do is remove any row that has a 1 in "rem", plus any row that has the same name as a row with a 1 in "rem". After the transformation, the data should look like this:

| name | value1 | rem |
|------|--------|-----|
| mark | 5      | 0   |
| mark | 9      | 0   |
| mark | 9      | 0   |

I can't figure out how to do this in R with a logical condition. My actual data has far more rows and columns, so I can't just delete rows by position (e.g. deleting the first 6 rows). I understand how to delete any row with a specific value. What I can't figure out is how to delete rows based on values in two columns, where one condition depends on the other. Here is some R code that creates a data frame like the one above:

name <- c("tom", "tom", "tom", "bill", "bill","bill","mark","mark","mark")
value1 <- c(1,3,5,7,1,3,5,9,9)
rem <- c(1,0,0,0,1,0,0,0,0)
df <- data.frame(name, value1, rem)

Upvotes: 2

Views: 8596

Answers (2)

din

Reputation: 692

Another way to do this:

# get the names that have at least one row with rem == 1,
# then find the rows whose name is not in that set
# and use the result to subset df
df[!(df$name %in% df$name[df$rem == 1]), ]
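
To see what each piece of the one-liner does, the same logic can be split into steps. This is only a sketch using the example data from the question; the intermediate names flagged_names, keep and result are made up for illustration.

```r
# Recreate the example data from the question
name <- c("tom", "tom", "tom", "bill", "bill", "bill", "mark", "mark", "mark")
value1 <- c(1, 3, 5, 7, 1, 3, 5, 9, 9)
rem <- c(1, 0, 0, 0, 1, 0, 0, 0, 0)
df <- data.frame(name, value1, rem)

# Step 1: names that have at least one row with rem == 1
flagged_names <- unique(df$name[df$rem == 1])

# Step 2: logical vector, TRUE for rows whose name is not flagged
keep <- !(df$name %in% flagged_names)

# Step 3: subset the rows, keeping all columns
result <- df[keep, ]
```

Note that result keeps the original row names (7, 8, 9); rownames(result) <- NULL resets them if that matters.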

Upvotes: 2

Nick Knauer

Reputation: 4243

You could do it this way:

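# install.packages("dplyr")  # run once if dplyr is not already installed
library(dplyr)

# Total "rem" per name; a total of 0 means no row for that name is flagged
newdf <- df %>%
  group_by(name) %>%
  summarise(rem = sum(rem))

# Names with no flagged rows
newdf2 <- filter(newdf, rem < 1)

# Keep only the original rows whose name appears in newdf2
result <- semi_join(df, newdf2, by = "name")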
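
A grouped filter is another option: it keeps every original row of each name that has no row with rem == 1. A minimal sketch, assuming dplyr is installed and using the example data from the question; the name kept is only illustrative.

```r
library(dplyr)

# Recreate the example data from the question
name <- c("tom", "tom", "tom", "bill", "bill", "bill", "mark", "mark", "mark")
value1 <- c(1, 3, 5, 7, 1, 3, 5, 9, 9)
rem <- c(1, 0, 0, 0, 1, 0, 0, 0, 0)
df <- data.frame(name, value1, rem)

# Within each name, any(rem == 1) is TRUE if at least one row is flagged;
# negating it keeps only the groups with no flagged rows
kept <- df %>%
  group_by(name) %>%
  filter(!any(rem == 1)) %>%
  ungroup()
```

Because the condition is evaluated once per group, all rows of a name are either kept or dropped together.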

Upvotes: 0
