Reputation: 43
I have a dataframe that contains several fields related to an identifier but some are disjointed:
id store manager fruit vegetable
1 Grocery1 Joe apple NA
1 Grocery1 Joe lemon NA
1 Grocery1 Joe NA zucchini
2 Grocery2 Amy orange NA
2 Grocery2 Amy NA asparagus
2 Grocery2 Amy NA spinach
3 Grocery3 Bill NA NA
I want the dataframe to look like:
id store manager fruit vegetable
1 Grocery1 Joe apple zucchini
1 Grocery1 Joe lemon zucchini
2 Grocery2 Amy orange asparagus
2 Grocery2 Amy orange spinach
3 Grocery3 Bill NA NA
Is there a way to easily do this?
Upvotes: 1
Views: 1293
Reputation: 19097
You can use tidyr::fill
to fill
the NA
, and only keep the non-duplicated rows using distinct
.
library(dplyr)
library(tidyr)
df %>%
group_by(store, manager) %>%
fill(fruit, vegetable, .direction = "updown") %>%
distinct()
# A tibble: 5 × 5
# Groups: store, manager [3]
id store manager fruit vegetable
<int> <chr> <chr> <chr> <chr>
1 1 Grocery1 Joe apple zucchini
2 1 Grocery1 Joe lemon zucchini
3 2 Grocery2 Amy orange asparagus
4 2 Grocery2 Amy orange spinach
5 3 Grocery3 Bill NA NA
Upvotes: 1