Ankit Kumar Singh

Reputation: 41

Select a range of rows from every n rows from a data frame

I have 2880 observations in my data.frame. I need to create a new data.frame containing rows 25 to 77 from every block of 96 rows.

df.new = df[seq(25, nrow(df), 77), ] # extract from 25 to 77

The above code only extracts rows 25 to 77, but I want rows 25 to 77 from every block of 96 rows.

Upvotes: 2

Views: 525

Answers (4)

Ronak Shah

Reputation: 388907

You can use R's vector recycling to extract these rows:

from = 25
to = 77
n = 96

df.new <- df[rep(c(FALSE, TRUE, FALSE), c(from - 1, to - from + 1, n - to)), ]

To explain how this works for this example:

length(rep(c(FALSE, TRUE, FALSE), c(24, 53, 19))) #returns
#[1] 96

Of these 96 values, positions 25 to 77 are TRUE and the rest are FALSE, which we can verify with:

which(rep(c(FALSE, TRUE, FALSE), c(24, 53, 19)))
# [1] 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46
#[23] 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68
#[45] 69 70 71 72 73 74 75 76 77

This logical vector is then recycled over all the remaining rows of the data frame, so the same pattern is applied to every block of 96 rows.
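As a quick check of the recycling step, here is a minimal sketch on a made-up data frame. The `id` column is purely illustrative (an assumption, not the asker's data); it just records the original row number so the selected rows are easy to inspect.

```r
# Toy data frame standing in for the asker's data (hypothetical `id` column)
df <- data.frame(id = 1:2880)

from <- 25
to <- 77
n <- 96

# Logical mask of length n: TRUE only at positions from..to
mask <- rep(c(FALSE, TRUE, FALSE), c(from - 1, to - from + 1, n - to))

# R recycles the length-n mask across all nrow(df) rows;
# drop = FALSE keeps the result a data frame even with a single column
df.new <- df[mask, , drop = FALSE]

nrow(df.new)         # number of selected rows: 30 blocks of 53 rows
head(df.new$id, 3)   # first rows taken from the first block
df.new$id[54:56]     # first rows taken from the second block
```

If the mask works as described, the second block should start at original row 96 + 25 = 121.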

Upvotes: 1

Darren Tsai

Reputation: 35554

A one-liner base solution.

lapply(split(df, cut(1:nrow(df), nrow(df)/96, F)), `[`, 25:77, )

Note: the argument after the last comma is intentionally left empty; it is the column index, so all columns are kept.


The code above returns a list with one data frame per block of 96 rows. To combine them into a single data frame, pass the result to

do.call(rbind, ...)
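Putting the two steps together, a minimal end-to-end sketch might look like the following. The toy `id` column is made up for illustration, and integer division is swapped in for cut() to label the blocks, which should give the same grouping when nrow(df) is a multiple of 96.

```r
# Toy data frame standing in for the asker's data (hypothetical `id` column)
df <- data.frame(id = 1:2880)

# Label each row with its 96-row block number (1, 2, ..., 30)
block <- (seq_len(nrow(df)) - 1) %/% 96 + 1

# Take rows 25..77 inside each block;
# drop = FALSE keeps one-column pieces as data frames
pieces <- lapply(split(df, block), function(d) d[25:77, , drop = FALSE])

# Stack the per-block pieces back into a single data frame
df.new <- do.call(rbind, pieces)

nrow(df.new)
```

Note that rbind() on a named list prefixes the row names with the block label, so reset them with rownames(df.new) <- NULL if that matters.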

Upvotes: 0

Ric S

Reputation: 9247

One option is to create a vector of indices with which to subset the data frame.

idx <- rep(25:77, times = nrow(df)/96) + 96*rep(0:29, each = 77-25+1)
df[idx, ]
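The same index vector can be built without hard-coding the number of blocks. The sketch below is one possible generalization, assuming a toy data frame with a made-up `id` column; the names `from`, `to`, `n` and `n_blocks` are illustrative and not part of the answer above.

```r
# Toy data frame standing in for the asker's data (hypothetical `id` column)
df <- data.frame(id = 1:2880)

from <- 25
to <- 77
n <- 96

# Number of complete n-row blocks (30 for the 2880-row example)
n_blocks <- nrow(df) %/% n

# outer() adds each block offset (0, 96, 192, ...) to each within-block position;
# as.vector() flattens column by column, so the indices come out in ascending order
idx <- as.vector(outer(from:to, n * (seq_len(n_blocks) - 1), `+`))

df.new <- df[idx, , drop = FALSE]

length(idx)
```

For the 2880-row case this should produce the same indices as the rep()-based expression above.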

Upvotes: 1

Limey

Reputation: 12461

First, define a Group variable, with values 1 to 30, each value repeating 96 times. Then define RowWithinGroup and filter as required. Finally, undo the changes introduced to do the filtering.

library(dplyr)
library(tibble)

df <- tibble(X=rnorm(2880)) %>% 
        add_column(Group=rep(1:30, each=96)) %>%   # 30 groups of 96 rows each
        group_by(Group) %>% 
        mutate(RowWithinGroup=row_number()) %>% 
        filter(RowWithinGroup >= 25 & RowWithinGroup <= 77) %>% 
        ungroup() %>%                              # ungroup first so Group can be dropped
        select(-Group, -RowWithinGroup)

Welcome to SO. This question may not have been asked in this exact form before, but the principles required have been referenced in many, many questions and answers.

Upvotes: 0
