antR

Reputation: 907

Difference between listing things with ":" vs "> or <"

I have a data set of babynames I've been playing around with. I'm using dplyr to filter for babies born in what's considered the millennial age, that is, any baby born from 1985 to 2005 (including 1985 and 2005). However, I noticed that I get different filtered results depending on how I phrase the filter argument.

trial1 <- filter(babynames, year == 1985:2005)
trial2 <- filter(babynames, year > 1984 & year < 2006)

trial1 gives me ~70,000 results while trial2 has roughly double that (~154,000). Is there a difference between these two forms of filtering? To me they should give the same output. I feel like I am missing something here.

Upvotes: 0

Views: 52

Answers (1)

SmitM

Reputation: 1376

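To the best of my knowledge, year == 1985:2005 does not check whether the year falls within that range. The == operator compares the two vectors element by element:
- It is most likely comparing the year in the first row with 1985, the 2nd with 1986, the 3rd with 1987, and so on.
- It does this until the 21st row, which is compared with 2005, and then the values get recycled. That is, the 22nd row is compared with 1985, the 23rd row with 1986, and so on.

Only rows whose year happens to equal the value lined up against it are kept, which would explain why trial1 returns fewer rows than trial2.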
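
The recycling described above can be seen on a small example. This is only a sketch on made-up data: the `year` vector and the shortened `target` range are hypothetical stand-ins chosen so the alignment is easy to follow, not values from babynames.

```r
# Hypothetical stand-in for the year column (not the real babynames data)
year <- c(1985, 1985, 1986, 1990, 1987, 2005)

# A shortened range (assumption: 3 values instead of 21) to keep the example readable
target <- 1985:1987

# == lines the vectors up position by position and recycles the shorter one:
# year:   1985 1985 1986 1990 1987 2005
# target: 1985 1986 1987 1985 1986 1987
elementwise <- year == target
print(elementwise)

# %in% instead asks, for each year, whether it appears anywhere in target
membership <- year %in% target
print(membership)
```

Here `elementwise` is TRUE only for the first element, while `membership` is TRUE for every year between 1985 and 1987, regardless of its position.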

Hope this answers your question.

P.S. - You could use the %in% operator to check against a range in the following way:

trial<-filter(babynames, year %in% 1985:2005)
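
As a rough check that the `%in%` form matches the range comparison from the question, the sketch below filters a tiny made-up data frame three ways. The `toy` data frame and its column values are hypothetical; `between()` is dplyr's inclusive range helper and is shown only as one more alternative, not as something the question used.

```r
library(dplyr)

# Hypothetical toy data, standing in for babynames
toy <- data.frame(
  year = c(1984, 1985, 1990, 2005, 2006),
  name = c("A", "B", "C", "D", "E")
)

# Membership test, as suggested in this answer
by_in <- filter(toy, year %in% 1985:2005)

# Explicit inclusive bounds, equivalent to year > 1984 & year < 2006 for whole years
by_range <- filter(toy, year >= 1985 & year <= 2005)

# dplyr::between() is inclusive on both ends
by_between <- filter(toy, between(year, 1985, 2005))

# All three should keep the same rows (1985, 1990, 2005)
print(identical(by_in, by_range))
print(identical(by_in, by_between))
```

On whole-number years all three forms should select the same rows; `%in%` would differ only if the column held values that are not exactly in the listed sequence.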

Upvotes: 1
