Reputation: 85
I am trying to fill out the NA values based on a lookup in the item_code column. Basically if the item_code has a assigned section, I would like it to look at the item_code in that row and check if there is a section assigned to the code elsewhere in the data, if yes use that or else NA. This is a huge data set.
item_code section
1 50406737 556
2 48147401 NA
3 49762314 NA
4 47860166 557
5 48147401 557
6 49762314 NA
7 49762314 554
8 50884988 554
9 50856064 NA
10 49762314 554
11 50868629 556
12 51041955 556
13 50856064 NA
14 48147401 NA
15 50460172 557
16 50856064 559
17 47860166 557
18 50459661 557
Upvotes: 0
Views: 66
Reputation: 5932
This should do the trick (notice that I added and additional item_code
in the table to add the case in which an item_code
only has NA
values in section
, which was missing from your example data)
require(tidyverse)
df= read.table(text =
"item_code section
1 50406737 556
2 48147401 NA
3 49762314 NA
4 47860166 557
5 48147401 557
6 49762314 NA
7 49762314 554
8 50884988 554
9 50856064 NA
10 49762314 554
11 50868629 556
12 51041955 556
13 50856064 NA
14 48147401 NA
15 50460172 557
16 50856064 559
17 47860166 557
18 50459661 557
19 50459662 NA",
header = TRUE
)
df2 <- df %>%
group_by(item_code) %>%
mutate(section = max(section, na.rm = T)) %>%
distinct(section) %>%
print()
Source: local data frame [11 x 2]
Groups: item_code [11]
section item_code
<int> <int>
1 556 50406737
2 557 48147401
3 554 49762314
4 557 47860166
5 554 50884988
6 559 50856064
7 556 50868629
8 556 51041955
9 557 50460172
10 557 50459661
11 NA 50459662
Upvotes: 1