Reputation: 51
I have a set of orders with items and where they are sourced from. I need to add a unique counter per order for the item and for the supplier. Note that this unique counter is a unique counter to each order.
library('tidyverse')
library('partitions')
df <- read_table('Order Item Source
1 100 Supplier1
1 101 Supplier1
1 102 Supplier2
1 106 Supplier3
2 107 Supplier4
2 108 Supplier4
3 104 Supplier5
3 103 Supplier6')
The output should look something like this table:
Order Item ItemNum Source SourceNum
1 100 1 Supplier1 1
1 101 2 Supplier1 1
1 102 3 Supplier2 2
1 106 4 Supplier3 3
2 107 1 Supplier4 1
2 108 2 Supplier4 1
3 104 1 Supplier5 1
3 103 2 Supplier6 2
Upvotes: 2
Views: 69
Reputation: 2229
Consider trying this using group_by
and mutate
df %>% group_by(Order) %>%
mutate(ItemNum = row_number(Source),
SourceNum = cumsum(!duplicated(Source)))
# A tibble: 8 x 5
# Groups: Order [3]
Order Item Source ItemNum SourceNum
<int> <int> <chr> <int> <int>
1 1 100 Supplier1 1 1
2 1 101 Supplier1 2 1
3 1 102 Supplier2 3 2
4 1 106 Supplier3 4 3
5 2 107 Supplier4 1 1
6 2 108 Supplier4 2 1
7 3 104 Supplier5 1 1
8 3 103 Supplier6 2 2
Upvotes: 1
Reputation: 42544
For the sake of completeness, here is also a data.table
solution:
library(data.table)
setDT(df)[, ItemNum := rowid(Order)][
order(Source), SourceNum := rleid(Source), by = Order][]
Order Item Source ItemNum SourceNum 1: 1 100 Supplier1 1 1 2: 1 101 Supplier1 2 1 3: 1 102 Supplier2 3 2 4: 1 106 Supplier3 4 3 5: 2 107 Supplier4 1 1 6: 2 108 Supplier4 2 1 7: 3 104 Supplier5 1 1 8: 3 103 Supplier6 2 2
Upvotes: 1
Reputation: 57210
A possible solution using ave function :
df$ItemNum <- ave(1:nrow(df),df$Order,FUN=function(x) as.integer(as.factor(df$Item[x])))
df$SourceNum <- ave(1:nrow(df),df$Order,FUN=function(x) as.integer(as.factor(df$Source[x])))
> df
Order Item Source ItemNum SourceNum
1 1 100 Supplier1 1 1
2 1 101 Supplier1 2 1
3 1 102 Supplier2 3 2
4 2 107 Supplier4 1 1
5 1 106 Supplier3 4 3
6 2 108 Supplier4 2 1
7 3 104 Supplier5 2 1
8 3 103 Supplier6 1 2
Note that this code assumes that df$Source
and df$Item
are characters (not factors).
Upvotes: 0