Reputation: 8247
I have following dataframe in R,which I want to arrange and generate sequence column
bay row tier flag
43 11 88 NA
43 11 90 NA
43 11 86 NA
43 9 88 NA
43 9 92 NA
13 4 88 89
13 4 90 NA
13 4 94 NA
11 4 88 89
11 7 90 NA
I want to arrange column tier
in descending order grouping by bay and row
Dataframe should look like
bay row tier flag
43 11 90 NA
43 11 88 NA
43 11 86 NA
43 9 92 NA
43 9 88 NA
13 4 94 NA
13 4 90 NA
13 4 88 89
11 7 90 NA
11 4 88 89
Then I want to generate a sequence for this arranged dataframe
bay row tier flag seq
43 11 90 NA 1
43 11 88 NA 2
43 11 86 NA 3
43 9 92 NA 4
43 9 88 NA 5
13 4 94 NA 6
13 4 90 NA 7
13 4 88 89 8
11 7 90 NA 9
11 4 88 89 8
The only thing I want to check while generating seq is when flag
is duplicated I want to keep same sequence number
for both rows otherwise different numbers for NA
values
bay,row and tier
is in integer
format
Upvotes: 2
Views: 124
Reputation: 886948
We can use data.table
. Convert the 'data.frame' to 'data.table' (setDT(df1)
), order
the 'bay', 'row' and 'tier' in descending order, then create a sequence column 'seq', get the index where the 'flag' are duplicated and is not an NA
, then specifying the i
as the logical index and grouped by 'flag' we change the 'seq' to the first value of 'seq'
library(data.table)
df2 <- setDT(df1)[order(-bay, -row, -tier)][, seq := seq_len(.N)]
i1 <- df2[, (duplicated(flag)|duplicated(flag, fromLast = TRUE)) & !is.na(flag)]
df2[i1, seq := seq[1L], .(flag)]
df2
# bay row tier flag seq
# 1: 43 11 90 NA 1
# 2: 43 11 88 NA 2
# 3: 43 11 86 NA 3
# 4: 43 9 92 NA 4
# 5: 43 9 88 NA 5
# 6: 13 4 94 NA 6
# 7: 13 4 90 NA 7
# 8: 13 4 88 89 8
# 9: 11 7 90 NA 9
#10: 11 4 88 89 8
Upvotes: 2