Reputation: 199
My data consist of word lists from different texts (text is the grouping variable). Within each text, I'm trying to assign a bin number that increases every 2000 rows.
My data look like this:
index text word
1 H6 mællte
2 H6 fleiru
...
66265 H6 han
1 DG8 Son
2 DG8 hins
3 DG8 var
...
2001 DG8 faer
2002 DG8 hælga
I would like it to look like this:
index text word bin
1 H6 mællte 1
2 H6 fleiru 1
...
66265 H6 han 34
1 DG8 Son 1
2 DG8 hins 1
3 DG8 var 1
...
2001 DG8 faer 2
2002 DG8 hælga 2
Upvotes: 1
Views: 317
Reputation: 18681
We can use rep() with dplyr:
library(dplyr)
df %>%
  group_by(text) %>%
  mutate(bin = rep(1:ceiling(n()/2000), each = 2000, length.out = n()))
length.out = n() ensures that when n() is not divisible by 2000, the last bin value is repeated only up to the final row of each group.
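As a cross-check, the same bins can be computed from each row's position within its group using integer division. The following is only a minimal sketch in base R, not part of the approach above: the toy data frame and the small bin_size are assumptions chosen so the result is easy to inspect, and bin_size would be 2000 for the real data.

```r
# Toy data (hypothetical stand-in for the real df): two texts of unequal length
df <- data.frame(
  text = rep(c("H6", "DG8"), times = c(5, 3)),
  word = letters[1:8]
)

bin_size <- 2  # assumption for the demo; use 2000 for the real data

# Position of each row within its text: 1, 2, 3, ... restarting per group
pos <- ave(seq_len(nrow(df)), df$text, FUN = seq_along)

# Rows 1..bin_size fall in bin 1, the next bin_size rows in bin 2, and so on;
# a short final bin needs no special handling
df$bin <- (pos - 1) %/% bin_size + 1

print(df)
```

Inside a grouped dplyr pipeline, the equivalent expression should be mutate(bin = (row_number() - 1) %/% 2000 + 1), which avoids building the rep() sequence.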
Upvotes: 1