Reputation: 915
I would like to remove the initial consecutive NA's(for Sales and Price only) that appears for each product group, until a first non-NA value appears for each product group.
S.No,Product,Date,Sales,Price,Count
1,Product1,1/1/2018,NA,NA,7
2,Product1,2/1/2018,NA,NA,6
3,Product1,3/4/2018,5,1,7
4,Product1,4/4/2018,NA,NA,3
5,Product1,5/5/2018,3,1,6
6,Product1,6/5/2018,2,4,10
7,Product1,7/6/2018,5,9,5
8,Product2,1/1/2018,NA,NA,6
9,Product2,2/1/2018,10,3,5
10,Product2,3/4/2018,NA,NA,2
11,Product2,4/4/2018,8,3,8
12,Product2,5/5/2018,3,6,7
13,Product2,6/5/2018,7,4,6
14,Product2,7/6/2018,10,3,3
15,Product3,2/1/2018,NA,NA,7
16,Product3,3/4/2018,NA,NA,2
17,Product3,4/4/2018,NA,NA,1
18,Product3,5/5/2018,NA,NA,7
19,Product3,6/5/2018,10,7,9
20,Product3,7/6/2018,6,3,7
Below is the output i am looking to extract.
S.No,Product,Date,Sales,Price,Count
3,Product1,3/4/2018,5,1,7
4,Product1,4/4/2018,NA,NA,3
5,Product1,5/5/2018,3,1,6
6,Product1,6/5/2018,2,4,10
7,Product1,7/6/2018,5,9,5
9,Product2,2/1/2018,10,3,5
10,Product2,3/4/2018,NA,NA,2
11,Product2,4/4/2018,8,3,8
12,Product2,5/5/2018,3,6,7
13,Product2,6/5/2018,7,4,6
14,Product2,7/6/2018,10,3,3
19,Product3,6/5/2018,10,7,9
20,Product3,7/6/2018,6,3,7
Would appreciate if someone can help me with the approach to tackle this problem. Is there any package or function in R, that takes care of this scenario.
Any leads would be highly appreciated.
Thanks, J
Upvotes: 0
Views: 51
Reputation: 79238
dat%>%
group_by(Product)%>%
filter_at(vars(4:5),all_vars(cumsum(!is.na(.))))
# A tibble: 13 x 6
# Groups: Product [3]
S.No Product Date Sales Price Count
<int> <fct> <fct> <int> <int> <int>
1 3 Product1 3/4/2018 5 1 7
2 4 Product1 4/4/2018 NA NA 3
3 5 Product1 5/5/2018 3 1 6
4 6 Product1 6/5/2018 2 4 10
5 7 Product1 7/6/2018 5 9 5
6 9 Product2 2/1/2018 10 3 5
7 10 Product2 3/4/2018 NA NA 2
8 11 Product2 4/4/2018 8 3 8
9 12 Product2 5/5/2018 3 6 7
10 13 Product2 6/5/2018 7 4 6
11 14 Product2 7/6/2018 10 3 3
12 19 Product3 6/5/2018 10 7 9
13 20 Product3 7/6/2018 6 3 7
Upvotes: 2