Reputation: 11

How can I subset my data into intervals

I'm new to R and I'm trying to make my script more efficient. I have a data.frame of 25480 observations and 17 variables.

One of my variables is Subject, and each subject has its own number. However, the number of observations (rows) is not the same for each subject. I would like to separate my subjects into groups according to their number. How can I do it?

Previously I used this command:

gaze <- subset(gaze, Subject != "261" & Subject != "270" & Subject != "275")

But now I have too many subjects to repeat Subject each time. Is it possible to define an interval of subjects to cut or split? I tried these commands, but they don’t seem to work:

gazeS <- (gaze$Subject[112:216])
cut(gaze, seq(gaze, from = 112, to = 116))

Could you help me to fix this code, please?

Upvotes: 1

Answers (2)

IRTFM

Reputation: 263481

Factor variables have no ordering method (even if their levels look numeric), so you need to convert them first for any ordering operation to work. The R-FAQ says to use:

as.numeric(as.character(fac))

So:

subset(gaze, !as.numeric(as.character(Subject)) %in% 260:280)

Or:

subset(gaze, !( as.numeric(as.character(Subject)) >= 260 &
            as.numeric(as.character(Subject)) <= 280)  )

Or:

subset( gaze, !Subject %in% as.character(260:280) )
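
To illustrate the three variants above, here is a minimal sketch on toy data. The data frame `gaze_demo` and its `Value` column are made up for illustration and are not the asker's data; the only assumption carried over from the question is that Subject is stored as a factor.

```r
# Hypothetical toy data: Subject stored as a factor, as assumed in the question
gaze_demo <- data.frame(
  Subject = factor(c("112", "261", "270", "275", "300")),
  Value   = c(1.5, 2.0, 3.5, 4.0, 5.5)
)

# Variant 1: convert labels to numbers, then drop subjects in 260:280
kept1 <- subset(gaze_demo, !as.numeric(as.character(Subject)) %in% 260:280)

# Variant 2: the same range written as two comparisons
kept2 <- subset(gaze_demo, !(as.numeric(as.character(Subject)) >= 260 &
                             as.numeric(as.character(Subject)) <= 280))

# Variant 3: compare the labels as character strings, no numeric conversion
kept3 <- subset(gaze_demo, !Subject %in% as.character(260:280))

# Subjects that remain after the exclusion
remaining <- as.character(kept1$Subject)
```

All three calls should leave the same rows here. The third form avoids the conversion entirely, but it only matches subject numbers written exactly as the integers in the range.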

Upvotes: 1

nico

Reputation: 51680

If I correctly understand what you need, you could use something like

gaze$Subject <- as.integer(as.character(gaze$Subject))
gaze <- subset(gaze, Subject >= 261 & Subject <= 280)

It is important to convert the id to character first. Otherwise funny things may happen: calling as.integer directly on a factor returns the internal level codes rather than the displayed numbers, and factor levels are ordered alphabetically, not numerically. The best way to avoid this, however, is to set the column classes directly when reading the data (e.g. with the colClasses parameter of read.table).
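
As a rough sketch of the colClasses approach, assuming a two-column whitespace-separated file: the contents are inlined through the `text` argument of read.table so the example is self-contained, and the names `raw`, `gaze_demo` and `Value` are hypothetical.

```r
# Hypothetical file contents, inlined via `text =` for illustration
raw <- "Subject Value
112 1.5
261 2.0
275 4.0
300 5.5"

# Read Subject as integer from the start, so no factor conversion is needed
gaze_demo <- read.table(text = raw, header = TRUE,
                        colClasses = c("integer", "numeric"))

# Keep only subjects in the interval 261..280
in_range <- subset(gaze_demo, Subject >= 261 & Subject <= 280)

# The pitfall being avoided: as.integer() on a factor returns level codes
f <- factor(c("261", "112"))
codes  <- as.integer(f)                 # internal codes, not 261 and 112
labels <- as.integer(as.character(f))   # the displayed numbers
```

With the column read as integer, range comparisons such as `Subject >= 261` work directly on the data.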

Upvotes: 0
