Reputation: 802
In a new variable row2, how can I repeat a sequential numbering (here a sequence from 3 to 6) for each group of duplicated row1 values, starting from a given value (here from row1 = 3), even if the last sequence is incomplete (here 3 to 5, for example)?
Thanks for help
Desired output:
> dat1
row1 row2
1 1 1
2 1 1
3 2 2
4 3 3 # start the sequence
5 4 4
6 4 4
7 4 4
8 5 5
9 5 5
10 6 6
11 6 6
12 6 6
13 7 3 # repeat the sequence
14 7 3
15 8 4
16 8 4
17 9 5
18 9 5
19 9 5
20 10 6
21 11 3 # and repeat again...
22 11 3
23 11 3
24 12 4
25 13 5
26 13 5 # ...even if incomplete
Initial data:
row1 <- c(1,1,2,
3,4,4,4,5,5,6,6,6,
7,7,8,8,9,9,9,10,
11,11,11,12,13,13)
dat1 <- data.frame(row1)
Upvotes: 1
Views: 95
Reputation: 7979
You might want to write a more concise version of this:
dat1 |>
transform(row2 = {
i = row1 < 3
c(row1[i], with(rle(row1[!i]), rep(rep(3:6, length.out=length(lengths)), lengths)))
})
row1 row2
1 1 1
2 1 1
3 2 2
4 3 3
5 4 4
6 4 4
7 4 4
8 5 5
9 5 5
10 6 6
11 6 6
12 6 6
13 7 3
14 7 3
15 8 4
16 8 4
17 9 5
18 9 5
19 9 5
20 10 6
21 11 3
22 11 3
23 11 3
24 12 4
25 13 5
26 13 5
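To see what the one-liner above is doing, here is the same logic unrolled into steps. This is only a sketch on the question's row1 vector; the intermediate names (below, runs, labels) are mine and not part of the answer. Note that it assumes all values below 3 come before the rest, as they do here.

```r
row1 <- c(1,1,2,
          3,4,4,4,5,5,6,6,6,
          7,7,8,8,9,9,9,10,
          11,11,11,12,13,13)

below <- row1 < 3            # values before the start value are kept as they are
runs  <- rle(row1[!below])   # runs of identical values from 3 onwards (11 runs here)

# one label per run: recycle 3:6 and cut the last cycle short if needed
labels <- rep(3:6, length.out = length(runs$lengths))

# expand each label to the length of its run and prepend the untouched values
row2 <- c(row1[below], rep(labels, runs$lengths))
row2
```

Because the labels are assigned per run and not per value, this keeps working if the row1 values are not consecutive integers.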
If you would like to apply this to the data from your previous question, we can wrap the operations that depend on the group pdf in a single tapply() (or by()) call, e.g.
tapply(dat0, ~pdf, \(x) {
x$row1 = with(rle(x$row0), rep(seq_along(values), lengths))
x$row2 = c(x$row1[x$row1 < 3], with(rle(x$row1[!x$row1 < 3]), rep(rep(3:6, length.out=length(lengths)), lengths)))
x
}) |> do.call(what='rbind') |> `row.names<-`(NULL) # cosmetics
if this
pdf page row0 row1 row2
1 x 3 5 1 1
2 x 3 5 1 1
3 x 3 5 1 1
4 x 3 5 1 1
5 x 3 6 2 2
6 x 3 6 2 2
7 x 3 6 2 2
8 x 3 7 3 3
9 x 3 7 3 3
10 x 4 1 4 4
11 x 4 1 4 4
12 x 4 1 4 4
13 x 4 2 5 5
14 x 4 2 5 5
15 x 4 2 5 5
16 x 4 2 5 5
17 x 4 3 6 6
18 y 6 2 1 1
19 y 6 2 1 1
20 y 6 3 2 2
21 y 6 3 2 2
22 y 6 3 2 2
23 y 6 4 3 3
24 y 6 4 3 3
25 y 7 1 4 4
26 y 7 1 4 4
27 y 7 1 4 4
28 y 7 1 4 4
29 y 7 2 5 5
30 y 7 2 5 5
31 y 7 2 5 5
32 y 7 3 6 6
33 y 8 1 7 3
34 y 8 1 7 3
35 y 8 2 8 4
is the desired result. Have a look at rows 32-35. (It might be better to rename row0-3 to col0-3.)
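If the formula interface of tapply() is not available in your R version, the same per-group logic can be sketched with split() and lapply(). The dat0 below is a small made-up stand-in (not the data from the previous question), and number_group is a hypothetical helper name.

```r
# made-up example data: two groups, with enough runs in "x" to wrap around
dat0 <- data.frame(
  pdf  = c("x","x","x","x","x","x","x","x", "y","y","y","y"),
  row0 = c( 5,  6,  7,  1,  2,  3,  1,  1,   2,  3,  3,  4)
)

number_group <- function(x) {
  # consecutive run id within the group
  x$row1 <- with(rle(x$row0), rep(seq_along(values), lengths))
  below  <- x$row1 < 3
  # keep ids below 3, then cycle 3:6 over the remaining runs
  x$row2 <- c(x$row1[below],
              with(rle(x$row1[!below]),
                   rep(rep(3:6, length.out = length(lengths)), lengths)))
  x
}

res <- do.call(rbind, lapply(split(dat0, dat0$pdf), number_group))
row.names(res) <- NULL   # cosmetics
res
```

In group x the seventh run wraps back to 3, while group y starts its own numbering from 1 again.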
The first line of the anonymous function is very useful. We can wrap it in a custom function:
consecutive_id = \(x) with(rle(x), rep(seq_along(values), lengths))
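A quick usage sketch for this helper, on a row0 vector like the one for pdf x in the table above (the vector is typed in by hand for illustration). The \(x) shorthand needs R 4.1 or later.

```r
consecutive_id = \(x) with(rle(x), rep(seq_along(values), lengths))

# row0 restarts at each page, so equal values can belong to different runs
row0 <- c(5, 5, 5, 5, 6, 6, 6, 7, 7, 1, 1, 1, 2, 2, 2, 2, 3)

# each run of identical neighbouring values gets the next integer id
ids <- consecutive_id(row0)
ids
```

The ids increase by one at every change of value, regardless of what the values themselves are.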
Upvotes: 1
Reputation: 4147
You could use if_else to apply a modulo to the values where row1 >= 3.
(row1 - 3) %% 4 cycles through 0, 1, 2, 3.
+ 3 shifts the sequence to start at 3, effectively mapping row1 values onto 3, 4, 5, 6 repeatedly.
Values of row1 < 3 are kept untouched.
library(dplyr)  # if_else() comes from dplyr
dat1$row2 <- if_else(dat1$row1 >= 3, (dat1$row1 - 3) %% 4 + 3, dat1$row1)
row1 row2
1 1 1
2 1 1
3 2 2
4 3 3
5 4 4
6 4 4
7 4 4
8 5 5
9 5 5
10 6 6
11 6 6
12 6 6
13 7 3
14 7 3
15 8 4
16 8 4
17 9 5
18 9 5
19 9 5
20 10 6
21 11 3
22 11 3
23 11 3
24 12 4
25 13 5
26 13 5
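The arithmetic behind the modulo can be checked on the distinct values alone. This is a small sketch in base R; vals, remainder and gappy are illustration names. The second part shows a case the question's data does not contain: the mapping depends on the values themselves, so it relies on row1 being consecutive integers.

```r
vals      <- 3:13
remainder <- (vals - 3) %% 4   # 0 1 2 3 0 1 2 3 0 1 2
mapped    <- remainder + 3     # 3 4 5 6 3 4 5 6 3 4 5
data.frame(row1 = vals, remainder, row2 = mapped)

# hypothetical input with gaps between the values
gappy        <- c(3, 3, 5, 5, 9)
gappy_mapped <- (gappy - 3) %% 4 + 3   # 3 3 5 5 5
gappy_mapped
```

If the numbering should follow the groups of duplicates and not the values (3 3 4 4 5 for the last vector), a run-based or rank-based approach is the safer choice.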
Upvotes: 2
Reputation: 79184
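We can do it like this:
sort(unique(row1)) %>% .[. >= 3] filters the sorted vector of unique values to only include those that are greater than or equal to 3.
match() gives the position of each row1 value in that filtered vector.
( ... - 1) %% 4 + 3 maps those positions onto the repeating sequence 3, 4, 5, 6.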
library(dplyr)
dat1 %>%
mutate(row2 = if_else(row1 < 3,
row1,
(
(match(
row1, sort(unique(row1)) %>%
.[. >= 3]) - 1) %% 4
) + 3
)
)
row1 row2
1 1 1
2 1 1
3 2 2
4 3 3
5 4 4
6 4 4
7 4 4
8 5 5
9 5 5
10 6 6
11 6 6
12 6 6
13 7 3
14 7 3
15 8 4
16 8 4
17 9 5
18 9 5
19 9 5
20 10 6
21 11 3
22 11 3
23 11 3
24 12 4
25 13 5
26 13 5
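For reference, the same idea can be sketched in base R without the pipe, which makes the intermediate steps visible. The names keys and pos are mine; this is an unrolled illustration and not the answer's own code.

```r
row1 <- c(1,1,2,
          3,4,4,4,5,5,6,6,6,
          7,7,8,8,9,9,9,10,
          11,11,11,12,13,13)

keys <- sort(unique(row1))
keys <- keys[keys >= 3]     # distinct values from the start value onwards
pos  <- match(row1, keys)   # rank of each value among keys; NA for values below 3

# keep values below 3, otherwise cycle the rank through 3, 4, 5, 6
row2 <- ifelse(is.na(pos), row1, (pos - 1) %% 4 + 3)
row2
```

Since the modulo is applied to the rank of each value and not to the value itself, gaps in row1 do not disturb the cycle.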
Upvotes: 2