Reputation: 31
I have a question that is probably basic and easy for most people to answer.
Imagine a simple, ordinary data frame with 20 rows (the columns don't matter in this example). Is there a way to get all the rows that follow a specific numeric selection pattern? E.g.: I want the first 3 rows, skip the next 5, then take the following 3 rows, skip the next 5 again, and so on until the end of the data frame is reached. What I need are the rows and their specific column.
Basically: RowsOfInterest, SkipThisAmountOfRows, RowsOfInterest, SkipThisAmountOfRows, for example: 1:3, 5, next 1:3 (after the 5 skipped ones), 5, 1:3 and so on.
Help would be appreciated - thanks in advance!
Upvotes: 0
Views: 483
Reputation: 34703
It may be easier to think of this in terms of modular arithmetic.
You have a pattern that repeats every 8 rows, so consider the row number modulo 8:
df[seq_len(nrow(df)) %% 8L %in% 1:3, ]
seq_len(nrow(df)) creates a vector 1, 2, 3, ..., nrow(df).
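To make the modular-arithmetic idea concrete, here is a small sketch that spells out each intermediate step. The 20-row df and the names idx, rem and keep are made up for illustration; only the final subsetting expression comes from the answer.

```r
# Hypothetical 20-row data frame, for illustration only
df <- data.frame(rownum = 1:20, anothercol = letters[1:20])

idx  <- seq_len(nrow(df))  # row numbers 1, 2, ..., 20
rem  <- idx %% 8L          # position inside each block of 8: 1, 2, ..., 7, 0, 1, ...
keep <- rem %in% 1:3       # TRUE only for the first 3 rows of each block

selected <- df[keep, ]
selected$rownum
# rows kept: 1 2 3 9 10 11 17 18 19
```

Rows 8 and 16 have remainder 0, so they fall into the skipped part of each block, which is why the test is against 1:3 rather than 0:2.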
In data.table, this could be slightly cleaner:
df[1:.N %% 8L %in% 1:3]
This also makes it clearer that there's a bit of an order-of-operations issue: which comes first, %% or %in%? Both are special %any% operators with equal precedence, so this rule from ?Syntax applies:
Within an expression operators of equal precedence are evaluated from left to right...
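As a quick check of that left-to-right rule, the sketch below compares the unparenthesised expression with an explicitly parenthesised one. The vector x is just a stand-in for the row numbers, not something from the question.

```r
x <- 1:20  # stand-in for seq_len(nrow(df))

# %% and %in% have equal precedence, so they are evaluated left to right
implicit <- x %% 8L %in% 1:3
explicit <- (x %% 8L) %in% 1:3
same_left_to_right <- identical(implicit, explicit)

# `:` binds more tightly than %%, so 1:20 %% 8L means (1:20) %% 8L
same_colon <- identical(1:20 %% 8L, (1:20) %% 8L)

same_left_to_right
same_colon
```

If the precedence ever feels unclear, adding the parentheses explicitly costs nothing and gives the same result.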
Upvotes: 1
Reputation: 28695
You can create a logical vector containing the pattern (e.g. 3 TRUEs then 5 FALSEs). Because it is a logical vector, the pattern is automatically recycled (repeated) across the rows of your df when you subset with it.
df <- data.frame(rownum = 1:20, anothercol = letters[1:20])
df[rep(c(TRUE, FALSE), c(3, 5)),]
#    rownum anothercol
# 1       1          a
# 2       2          b
# 3       3          c
# 9       9          i
# 10     10          j
# 11     11          k
# 17     17          q
# 18     18          r
# 19     19          s
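If the "take some, skip some" sizes vary, the same recycling idea could be wrapped in a small function. This is only a sketch: take_skip and its arguments take and skip are hypothetical names, not part of base R, and rep_len() is used here to spell out the recycling explicitly instead of relying on it happening implicitly.

```r
# Hypothetical helper: keep `take` rows, skip `skip` rows, repeat to the end
take_skip <- function(df, take, skip) {
  # One block of the pattern, e.g. TRUE TRUE TRUE FALSE FALSE FALSE FALSE FALSE
  pattern <- rep(c(TRUE, FALSE), c(take, skip))
  # Repeat (and truncate) the block to exactly nrow(df) elements
  keep <- rep_len(pattern, nrow(df))
  df[keep, , drop = FALSE]
}

df <- data.frame(rownum = 1:20, anothercol = letters[1:20])
result <- take_skip(df, take = 3, skip = 5)
result$rownum
# rows kept: 1 2 3 9 10 11 17 18 19
```

The drop = FALSE argument keeps the result a data frame even if df has a single column.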
Upvotes: 4