Maleeha Shahid

Reputation: 135

Function for indexing in R

I have a dataset that reads something like this:


  record_id <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3)

  voucher_number <- c("app1", "00000", "11111", "22222", "11111", "app2", "33333", "44444", "33333",
                      "33333", "app", "55555", "66666", "55555", "66666", "55555", "77777")

  ds <- data.frame(record_id, voucher_number, stringsAsFactors = FALSE)

   record_id voucher_number
1          1          
2          1          00000
3          1          11111
4          1          22222
5          1          11111
6          2         
7          2          33333
8          2          44444
9          2          33333
10         2          33333
11         3         
12         3          55555
13         3          66666
14         3          55555
15         3          66666
16         3          55555
17         3          77777

I want to write a function that, after grouping by record_id, creates a new variable, say ice. The value of ice should be app if voucher_number is missing. Otherwise, ice should index the occurrences of each voucher_number within the same record_id as 1, 2, 3, and so forth: the n-th occurrence of a given voucher_number gets n, so a voucher_number that is new for that record_id, or never repeated, gets 1.

Something like the following:

   record_id voucher_number ice
1          1           app1 app
2          1          00000   1
3          1          11111   1
4          1          22222   1
5          1          11111   2
6          2           app2 app
7          2          33333   1
8          2          44444   1
9          2          33333   2
10         2          33333   3
11         3           app3 app
12         3          55555   1
13         3          66666   1
14         3          55555   2
15         3          66666   2
16         3          55555   3
17         3          77777   1

Ultimately, I want the dataset to be ordered by record_id and voucher_number.

Thanks so much!

Upvotes: 0

Views: 152

Answers (1)

Ronak Shah

Reputation: 388982

We can number the rows within each combination of record_id and voucher_number, then replace the ice value with "app" wherever voucher_number has "app" in it.

library(dplyr)

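ds %>%
  group_by(record_id, voucher_number) %>%
  # running count within each record_id/voucher_number pair
  mutate(ice = row_number(), 
         # assigning 'app' coerces ice to character
         ice = replace(ice, grep('app', voucher_number), 'app'))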

#   record_id voucher_number ice  
#       <dbl> <chr>          <chr>
# 1         1 app1           app  
# 2         1 00000          1    
# 3         1 11111          1    
# 4         1 22222          1    
# 5         1 11111          2    
# 6         2 app2           app  
# 7         2 33333          1    
# 8         2 44444          1    
# 9         2 33333          2    
#10         2 33333          3    
#11         3 app            app  
#12         3 55555          1    
#13         3 66666          1    
#14         3 55555          2    
#15         3 66666          2    
#16         3 55555          3    
#17         3 77777          1    
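
The question also asks for the result to be ordered by record_id and voucher_number, which the pipeline above does not do. Below is a minimal base R sketch of the same idea, not the code from this answer: it uses ave() for the per-group counter and order() for the sort. The names counter and ds_sorted are only illustrative, and it assumes the "app" rows can be identified by an "app" prefix in voucher_number.

```r
# Rebuild the example data from the question.
record_id <- c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3)
voucher_number <- c("app1", "00000", "11111", "22222", "11111",
                    "app2", "33333", "44444", "33333", "33333",
                    "app", "55555", "66666", "55555", "66666", "55555", "77777")
ds <- data.frame(record_id, voucher_number, stringsAsFactors = FALSE)

# Running count of each voucher_number within each record_id.
counter <- ave(seq_len(nrow(ds)), ds$record_id, ds$voucher_number,
               FUN = seq_along)

# Assumption: "app" rows are those whose voucher_number starts with "app".
ds$ice <- ifelse(startsWith(ds$voucher_number, "app"),
                 "app", as.character(counter))

# Sort by record_id, then voucher_number (compared as character strings,
# so the exact position of the "app" rows depends on the collation in use).
ds_sorted <- ds[order(ds$record_id, ds$voucher_number), ]
ds_sorted
```

Because ice is computed before sorting, the counter still reflects the original row order within each record_id. With dplyr, adding ungroup() and arrange(record_id, voucher_number) to the end of the pipeline should give the same ordering.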

Upvotes: 2
