accibio

Reputation: 559

Subsetting a dataframe in R based on positions of elements in the columns

I have a dataframe with 5 rows and 6 columns. It may or may not have elements in columns category1 to category5.

Case 1: having elements in category1 to category5

df <- data.frame(
  Hits = c("Hit1", "Hit2", "Hit3", "Hit4", "Hit5"),
  category1 = c("a1", "", "b1", "a1", "c1"),
  category2 = c("", "", "", "", "a2"),
  category3 = c("a3", "", "b3", "", "a3"),
  category4 = c("", "", "", "", ""),
  category5 = c("", "", "a5", "b5", ""),
  stringsAsFactors = FALSE)


Case 2: having no elements in category1 to category5


For Case 1, in each of the columns category1 to category5, I need to retain only the element at the topmost position, i.e. the first non-empty entry, and blank out every entry below it.

Finally, I need to drop the rows that have no elements left in these five columns.

For Case 2, I would like to retain only the topmost row.

How do I combine the solutions to both cases into one?

Upvotes: 0

Views: 89

Answers (1)

Ronak Shah

Reputation: 389055

You can use a conditional statement to handle the two cases and include it in slice -

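library(dplyr)

df %>%
  # In each category column, blank out everything except the first non-empty entry
  mutate(across(starts_with('category'), ~replace(., -match(TRUE, . != ''), ''))) %>%
  slice({
    # Rows that still have at least one non-empty category value
    tmp <- if_any(starts_with('category'), ~. != '')
    # Case 1: keep those rows; Case 2 (nothing left): keep only the first row
    if(any(tmp)) which(tmp) else 1
  })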

For case 1 this returns -

#  Hits category1 category2 category3 category4 category5
#1 Hit1        a1                  a3                    
#2 Hit3                                                a5
#3 Hit5                  a2                              

For case 2 -

#  Hits category1 category2 category3 category4 category5
#1 Hit1                            
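
If it helps to see the same two steps spelled out without dplyr, here is a rough base R sketch. It is not part of the pipeline above, only an illustration of the same idea. The helper name keep_topmost, its prefix argument, and the df_empty object are made up for this example, and df_empty is only my guess at what the Case 2 input looks like (every category column set to an empty string).

```r
# Case 1 input, copied from the question.
df <- data.frame(
  Hits = c("Hit1", "Hit2", "Hit3", "Hit4", "Hit5"),
  category1 = c("a1", "", "b1", "a1", "c1"),
  category2 = c("", "", "", "", "a2"),
  category3 = c("a3", "", "b3", "", "a3"),
  category4 = c("", "", "", "", ""),
  category5 = c("", "", "a5", "b5", ""),
  stringsAsFactors = FALSE)

# keep_topmost() is a hypothetical helper, not a function from any package.
keep_topmost <- function(data, prefix = "category") {
  cols <- grep(paste0("^", prefix), names(data), value = TRUE)

  # Step 1: in each column, keep only the first non-empty entry.
  data[cols] <- lapply(data[cols], function(x) {
    first <- match(TRUE, x != "")        # NA when the column is all empty
    if (!is.na(first)) x[-first] <- ""   # blank every entry except the topmost
    x
  })

  # Step 2: keep rows that still hold a value, or fall back to row 1 (Case 2).
  has_value <- rowSums(data[cols] != "") > 0
  if (any(has_value)) data[has_value, , drop = FALSE] else data[1, , drop = FALSE]
}

# Case 1: rows Hit1, Hit3 and Hit5 remain.
keep_topmost(df)

# Assumed Case 2 input: same Hits, all category columns empty.
df_empty <- df
df_empty[grep("^category", names(df_empty))] <- ""
keep_topmost(df_empty)
```

One difference from the dplyr version: base R subsetting keeps the original row names (1, 3, 5 for Case 1) instead of renumbering them, so reset them with rownames(result) <- NULL if that matters.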

Upvotes: 2
