Homer Jay Simpson
Homer Jay Simpson

Reputation: 1224

How can i select specific columns that start with a word according to a condition of another column in R using dplyr?

I have a data frame that looks like this :

date var cat_low dog_low cat_high dog_high Love Friend
2022-01-01 A 1 7 13 19 NA friend
2022-01-01 A 2 8 14 20 NA friend
2022-01-01 A 3 9 15 21 NA friend
2022-02-01 B 4 10 16 22 love NA
2022-02-01 B 5 11 17 23 love NA
2022-02-01 B 6 12 18 24 love NA

I want to select the columns related to columns Love and Friend. If the column Love is love to give the columns that starts with cat and if the column Friend is friend to give me the columns that start with dog.

ideally i want to look like this :

date var a b
2022-01-01 A 7 19
2022-01-01 A 8 20
2022-01-01 A 9 21
2022-02-01 B 4 16
2022-02-01 B 5 17
2022-02-01 B 6 18
library(lubridate)
date = c(rep(as.Date("2022-01-01"),3),rep(as.Date("2022-02-01"),3))
var = c(rep("A",3),rep("B",3))
cat_low = seq(1,6,1)
dog_low = seq(7,12,1)
cat_high = seq(13,18,1)
dog_high = seq(19,24,1)
Friend = c(rep("friend",3),rep(NA,3))
Love = c(rep(NA,3),rep("love",3))
df = tibble(date,var,cat_low,dog_low,cat_high,dog_high,Love,Friend);df


Any help? How i can do that in R using dplyr ?

Upvotes: 0

Views: 64

Answers (2)

Andre Wildberg
Andre Wildberg

Reputation: 19191

With dplyr try this. The first summarise filters for dog or cat, the second renames and puts the variables together.

library(dplyr)

df %>% 
  summarise(date, var, 
    across(starts_with("dog"), ~ .x[Friend == "friend"]), 
    across(starts_with("cat"), ~ .x[Love == "love"])) %>% 
  rename(a = dog_low, b = dog_high) %>% 
  summarise(date, var, a = ifelse(is.na(a), cat_low, a), 
    b = ifelse(is.na(b), cat_high, b))
        date var a  b
1 2022-01-01   A 7 19
2 2022-01-01   A 8 20
3 2022-01-01   A 9 21
4 2022-02-01   B 4 16
5 2022-02-01   B 5 17
6 2022-02-01   B 6 18

Upvotes: 1

Maël
Maël

Reputation: 52319

There might be better ways, but here's one:

library(tidyr)
library(dplyr)
df %>% 
  pivot_longer(cols = starts_with(c("cat", "dog")),
               names_to = c("animal", ".value"),
               names_pattern = "(cat|dog)_(low|high)") %>% 
  filter((is.na(Love) & animal == "dog") |
         (is.na(Friend) & animal == "cat")) %>% 
  select(date, var, low, high)

output

# A tibble: 6 × 4
  date       var     low  high
  <date>     <chr> <dbl> <dbl>
1 2022-01-01 A         7    19
2 2022-01-01 A         8    20
3 2022-01-01 A         9    21
4 2022-02-01 B         4    16
5 2022-02-01 B         5    17
6 2022-02-01 B         6    18

Upvotes: 1

Related Questions