Omid Mehrpour

Reputation: 3

delete rows with a specific sum in R

I have a data frame with 140 columns and 2000 rows. I want to find the rows where the sum of columns 2:131 equals 1 and at least one of these conditions is true: seizuremultidiscrete = 1, seizuresingle = 1, or seizurestatus = 1.

I then want to delete those rows. This is what I tried:

bupro%>%select(bupro[rowSums(bupro[2:131]==1)]&&((bupro["Seizuresingle"] =1 | bupro["Seizuresstatus"]| bupro["Seizuresmultidiscrete"|=1)) 


Any help is appreciated.

Upvotes: 0

Views: 58

Answers (1)

philiptomk

Reputation: 763

As stated by @onyambu, you need to use filter rather than select: select picks columns, while filter picks rows. Moreover, to delete the rows that fit the criteria, you must put a ! in front of the overall condition:

library(tidyverse)
set.seed(100)
# Example data: three 0/1 seizure flags plus four numeric columns
df <- tibble(
  seizuresingle = sample(0:1, 100, replace = TRUE),
  seizuremultidiscrete = sample(0:1, 100, replace = TRUE),
  seizurestatus = sample(0:1, 100, replace = TRUE),
  col1 = sample(seq(0, 1, by = 0.1), 100, replace = TRUE),
  col2 = sample(seq(0, 1, by = 0.1), 100, replace = TRUE),
  col3 = sample(seq(0, 1, by = 0.1), 100, replace = TRUE),
  col4 = sample(seq(0, 1, by = 0.1), 100, replace = TRUE)
)
# Keep only the rows that do NOT meet both conditions
df %>% 
  rowwise() %>% 
  filter(!(sum(c_across(starts_with("col"))) == 1 && sum(c_across(starts_with("seiz"))) >= 1))
#> # A tibble: 94 × 7
#> # Rowwise: 
#>    seizuresingle seizuremultidiscrete seizurestatus  col1  col2  col3  col4
#>            <int>                <int>         <int> <dbl> <dbl> <dbl> <dbl>
#>  1             1                    1             1   0.9   0.4   0.5   0.9
#>  2             0                    0             1   0.2   0     0.4   0  
#>  3             1                    0             0   1     1     0.5   0.5
#>  4             1                    0             0   0.7   0.1   0.6   0.2
#>  5             0                    1             1   0.9   0.4   0.4   0.9
#>  6             0                    0             0   0.3   0.7   0.5   0.9
#>  7             1                    1             1   0.6   0.4   0.6   0.2
#>  8             1                    1             1   0     0.4   0     0.5
#>  9             1                    0             0   0.2   0.3   0.9   0.4
#> 10             0                    0             1   0.8   0     0.3   0.3
#> # … with 84 more rows

If we take out the !, we can see the rows that were filtered out:

df %>% 
  rowwise() %>% 
  filter((sum(c_across(starts_with("col"))) == 1 && sum(c_across(starts_with("seiz"))) >= 1))
#> # A tibble: 6 × 7
#> # Rowwise: 
#>   seizuresingle seizuremultidiscrete seizurestatus  col1  col2  col3  col4
#>           <int>                <int>         <int> <dbl> <dbl> <dbl> <dbl>
#> 1             1                    0             1   0.1   0.4   0.2   0.3
#> 2             1                    1             1   0.5   0.4   0     0.1
#> 3             0                    1             0   0.7   0.1   0     0.2
#> 4             0                    0             1   0.2   0     0     0.8
#> 5             1                    1             0   0     0.4   0.1   0.5
#> 6             1                    1             1   0.1   0     0.7   0.2
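
For a data frame laid out like the one in the question, where the columns are addressed by position, a vectorized base R version with rowSums may be simpler than rowwise(). The following is only a sketch: the toy data frame and its columns id, a, b, c and d are made up for illustration, with columns 2:5 standing in for columns 2:131, and it assumes the data contain no NA values.

```r
# Hypothetical toy data standing in for the asker's `bupro` data frame:
# column 1 is an id, columns 2:5 play the role of columns 2:131.
bupro <- data.frame(
  id = 1:5,
  a = c(1, 0, 0, 1, 0),
  b = c(0, 0, 1, 1, 0),
  c = c(0, 1, 0, 0, 0),
  d = c(0, 0, 0, 0, 0),
  seizuresingle = c(1, 0, 0, 1, 0),
  seizuremultidiscrete = c(0, 0, 1, 0, 0),
  seizurestatus = c(0, 0, 0, 0, 1)
)

seizure_cols <- c("seizuresingle", "seizuremultidiscrete", "seizurestatus")

# TRUE for rows whose positional columns sum to 1 AND that have
# at least one seizure flag equal to 1.
# Note the single `&`: it compares element by element across all rows.
drop <- rowSums(bupro[2:5]) == 1 & rowSums(bupro[seizure_cols] == 1) >= 1

# Keep every row that does not meet both conditions
result <- bupro[!drop, ]
result$id
```

Here rows 1 and 3 are dropped, because they sum to 1 and have a seizure flag set, while rows 2, 4 and 5 are kept. If the summed columns hold decimals rather than 0/1 values, an exact == 1 comparison can be unreliable because of floating-point rounding, so a tolerance-based check may be safer.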

Upvotes: 1
