Reputation: 1773
I need to filter the data based on two columns: each person's first year and the location in that year. I want to get all the rows for every person whose FirstPlace is 1 in London, i.e. whose first city is London. Any suggestions on how I can do that? For the example below, the result should contain all rows for John, since he lived in London in his first year.
year <- c(2008, 2009, 2010, 2009, 2010, 2011)
person <- c('John', 'John', 'John', 'Brian', 'Brian','Vickey')
location <- c('London','Paris', 'Newyork','Paris','Paris','Miami')
df <- data.frame(year, person, location)
library(dplyr)
df %>% group_by(person) %>% mutate(FirstPlace = +(min(year) == year))
Upvotes: 1
Views: 125
Reputation: 83215
Using data.table:
library(data.table)
setDT(df)[order(year), if(first(location) == 'London') .SD, by = person]
which gives:
   person year location
1:   John 2008   London
2:   John 2009    Paris
3:   John 2010  Newyork
Or with dplyr:
library(dplyr)
df %>%
  arrange(year) %>%
  group_by(person) %>%
  filter(first(location) == 'London')
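If you would rather keep the FirstPlace flag from the question, a variant along these lines should also work. This is only a sketch: it rebuilds the sample data from the question, and the name `result` is just an illustrative placeholder. It keeps a whole group when any row flagged as the first year is in London, so a person with two rows in their earliest year would be kept if either of those rows is London.

```r
library(dplyr)

# Sample data copied from the question
year <- c(2008, 2009, 2010, 2009, 2010, 2011)
person <- c('John', 'John', 'John', 'Brian', 'Brian', 'Vickey')
location <- c('London', 'Paris', 'Newyork', 'Paris', 'Paris', 'Miami')
df <- data.frame(year, person, location)

result <- df %>%
  group_by(person) %>%
  # Flag each person's earliest year, as in the question
  mutate(FirstPlace = +(min(year) == year)) %>%
  # Keep every row of a group whose flagged row is in London
  filter(any(FirstPlace == 1 & location == 'London')) %>%
  ungroup()

print(result)
```

Unlike the arrange() version, this does not depend on row order, because the first year is identified by the flag rather than by position.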
Upvotes: 3