Nick Knauer
Nick Knauer

Reputation: 4243

Replace Value with another value in same column by Group

I have a dataset below:

       Date    Group  Value
 2015-02-15        A     10
 2015-02-23        A    422
 2015-03-02        A     89
 2015-02-15        B     32
 2015-02-23        B     11
 2015-02-15        C     30
 2015-03-02        C      2

I want to make a rule where, for each group, if a value appears for 2015-02-15, replace that value with the value of Date 2015-03-02 by group.

Expected output:

       Date    Group  Value
 2015-02-15        A     89    <----replaced
 2015-02-23        A    422
 2015-03-02        A     89
 2015-02-15        B     32    <----not replaced since 2015-03-02 doesn't exist
 2015-02-23        B     11
 2015-02-15        C      2    <----replaced
 2015-03-02        C      2  

Upvotes: 0

Views: 1264

Answers (1)

Jake Thompson
Jake Thompson

Reputation: 2843

Here's a solution using the tidyverse. The idea is to create another data frame of replacement values. I did this by filtering to only March 2 (since these are the values you want to use). Then I join that onto the original data frame. From there, you can use case_when to say, if the date is February 15 and a new value is available, use that, otherwise use the original value.

library(tidyverse)

df <- data_frame(
  Date = c("2015-02-15", "2015-02-32", "2015-03-02", "2015-02-15",
    "2015-02-23", "2015-02-15", "2015-03-02"),
  Group = c("A", "A", "A", "B", "B", "C", "C"),
  Value = c(10, 422, 89, 32, 11, 30, 2)
)

df
#> # A tibble: 7 x 3
#>   Date       Group Value
#>   <chr>      <chr> <dbl>
#> 1 2015-02-15 A        10
#> 2 2015-02-32 A       422
#> 3 2015-03-02 A        89
#> 4 2015-02-15 B        32
#> 5 2015-02-23 B        11
#> 6 2015-02-15 C        30
#> 7 2015-03-02 C         2

replace_values <- df %>%
  filter(Date == "2015-03-02") %>%
  select(Group, new_value = Value)

replace_values
#> # A tibble: 2 x 2
#>   Group new_value
#>   <chr>     <dbl>
#> 1 A            89
#> 2 C             2

df %>%
  group_by(Group) %>%
  left_join(replace_values, by = "Group") %>%
  mutate(
    Value = case_when(
      Date == "2015-02-15" & !is.na(new_value) ~ new_value,
      TRUE ~ Value
    )
  ) %>%
  select(-new_value)
#> # A tibble: 7 x 3
#> # Groups:   Group [3]
#>   Date       Group Value
#>   <chr>      <chr> <dbl>
#> 1 2015-02-15 A        89
#> 2 2015-02-32 A       422
#> 3 2015-03-02 A        89
#> 4 2015-02-15 B        32
#> 5 2015-02-23 B        11
#> 6 2015-02-15 C         2
#> 7 2015-03-02 C         2

Created on 2018-10-31 by the reprex package (v0.2.1)

Upvotes: 1

Related Questions