Reputation: 559
I have a data frame df with 5 rows and 6 columns. df may or may not have elements in the columns category1 to category5.
Case 1: df having elements in category1 to category5
df <- data.frame(
  Hits = c("Hit1", "Hit2", "Hit3", "Hit4", "Hit5"),
  category1 = c("a1", "", "b1", "a1", "c1"),
  category2 = c("", "", "", "", "a2"),
  category3 = c("a3", "", "b3", "", "a3"),
  category4 = c("", "", "", "", ""),
  category5 = c("", "", "a5", "b5", ""),
  stringsAsFactors = FALSE)
Case 2: df having no elements in category1 to category5
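As a minimal sketch of what Case 2 could look like, assuming the same six columns with every category value empty (the name df2 is hypothetical and only used here for illustration):

```r
# Hypothetical Case 2 input: same shape as df, but no category values at all
df2 <- data.frame(
  Hits = c("Hit1", "Hit2", "Hit3", "Hit4", "Hit5"),
  category1 = rep("", 5),
  category2 = rep("", 5),
  category3 = rep("", 5),
  category4 = rep("", 5),
  category5 = rep("", 5),
  stringsAsFactors = FALSE)
```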
For Case 1, in each of the columns category1 to category5, I need to retain only the element that appears at the topmost position (the first non-empty value in that column),
and finally drop the rows that have no elements left in these five columns.
For Case 2, I would like to retain only the topmost row.
How do I merge together the solutions to both the cases?
Upvotes: 0
Views: 89
Reputation: 389055
You can use a conditional statement to handle the two cases and put it inside slice() -
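library(dplyr)

df %>%
  # In each category column, blank out everything except the first non-empty value
  mutate(across(starts_with('category'), ~replace(., -match(TRUE, . != ''), ''))) %>%
  slice({
    # TRUE for rows that still have at least one non-empty category value
    tmp <- if_any(starts_with('category'), ~. != '')
    # Case 1: keep those rows; Case 2: nothing is left, so keep the first row
    if(any(tmp)) which(tmp) else 1
  })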
For case 1 this returns -
#  Hits category1 category2 category3 category4 category5
#1 Hit1        a1                  a3
#2 Hit3                                                a5
#3 Hit5                  a2
For case 2 -
#  Hits category1 category2 category3 category4 category5
#1 Hit1
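To see why the replace()/match() step works, and how the same idea could be written without dplyr, here is a rough base R sketch. The names keep_first, keep_topmost, demo and empty_demo are made up for this illustration and are not part of the code above; the function assumes the category columns are character vectors that use "" for "no element".

```r
# Keep only the first non-empty element of a character vector.
# match(TRUE, x != "") gives the position of the first non-empty value,
# and the negative index blanks out every other position.
# If x is all empty, match() returns NA and nothing is replaced.
keep_first <- function(x) replace(x, -match(TRUE, x != ""), "")

# Apply keep_first() to every column whose name starts with `prefix`,
# then keep the rows that still have a value (or the first row if none do).
keep_topmost <- function(data, prefix = "category") {
  cols <- grep(paste0("^", prefix), names(data), value = TRUE)
  data[cols] <- lapply(data[cols], keep_first)
  keep <- rowSums(data[cols] != "") > 0
  # Case 2 fallback: no category values anywhere, so keep only row 1
  if (!any(keep)) keep <- seq_len(nrow(data)) == 1
  data[keep, , drop = FALSE]
}

# Small toy inputs for both cases
demo <- data.frame(
  Hits = c("Hit1", "Hit2", "Hit3"),
  category1 = c("", "b1", "a1"),
  category2 = c("", "", ""),
  stringsAsFactors = FALSE)

empty_demo <- demo
empty_demo$category1 <- ""

keep_topmost(demo)        # only the Hit2 row survives, with category1 = "b1"
keep_topmost(empty_demo)  # no category values, so only the Hit1 row is kept
```

Note that, unlike slice(), the base R subsetting keeps the original row names, so the surviving rows are not renumbered from 1.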
Upvotes: 2