Sburg13

Reputation: 121

How can I use multiple conditionals and match to create a new variable?

I have the following data

Name <- c("Kobe Bryant", "Kobe Bryant", "Kobe Bryant", 
          "Kobe Bryant", "Kobe Bryant", "Kobe Bryant", 
          "Lebron James", "Lebron James", "Lebron James", 
          "Lebron James", "Kevin Durant", "Kevin Durant",
          "Kevin Durant", "Kevin Durant", "Kevin Durant")

Date <- as.Date(c("2015-05-14", "2015-05-15", "2015-05-19", "2015-05-21", 
           "2015-05-24", "2015-05-28", "2015-05-14", "2015-05-20", 
           "2015-05-21", "2015-05-23", "2015-05-22", "2015-05-24", 
           "2015-05-28", "2015-06-02", "2015-06-04"))

df <- data.frame(Name, Date)

Desired_output <- c(1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0)

df2 <- data.frame(Name, Date, Desired_output)

I want to create a new column that identifies the back-to-back games (playing a game two consecutive days) for a specific player.

Output of the column: 1 if the game is part of a b2b, 0 if not.

Both the first day and the second day of the b2b should have a 1.

Upvotes: 0

Views: 80

Answers (1)

Ben Bolker

Reputation: 226087

This is a split-apply-combine problem (since you need to handle each player separately), which you can do in base R (by(), aggregate(), ...) or with a variety of packages (plyr, dplyr, data.table) ... here's a plyr solution.

Name <- rep(c("Kobe Bryant", "Lebron James", "Kevin Durant"),
            c(6,4,5))
Date <- as.Date(c("2015-05-14", "2015-05-15", "2015-05-19",
  "2015-05-21","2015-05-12", "2015-05-28", "2015-05-14",
  "2015-05-16","2015-05-17", "2015-05-21", "2015-05-22",
  "2015-05-24","2015-05-28","2015-06-02","2015-06-10"))
dd <- data.frame(Name,Date)
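b2b <- function(x,ind=FALSE) {
    x0 <- head(x,-1)  ## all but last
    x1 <- tail(x,-1)  ## all but first
    comp <- abs(x0-x1)==1  ## TRUE where adjacent games are one day apart
    res <- c(comp,FALSE) | c(FALSE,comp)  ## flag both games of each pair
    if (ind) {
        ## mark a flagged game that directly follows another flagged game with 2
        w <- res==1 & c(0,res[-length(res)])==1
        res[w] <- 2
    }
    return(res)
}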
library("plyr")
ddply(dd,"Name",
      transform,
         b2b=as.numeric(b2b(Date)),
         b2b_ind=as.numeric(b2b(Date,ind=TRUE)))

My code has automatically reorganized the players by alphabetical order (because players got turned into a factor with levels in alphabetical order, and ddply returns the data in this rearranged order). If that's important you can make sure the factors are ordered the way you want before beginning.
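
If keeping the original row order matters, one possible alternative is a base R sketch built on ave(), which applies a function within each group and writes the results back in place. This is only an illustration, not part of the plyr approach above: flag_b2b is a hypothetical helper name, the data is the question's, and the result is assumed to go in a new column called b2b.

```r
# Data from the question (15 games, three players).
Name <- rep(c("Kobe Bryant", "Lebron James", "Kevin Durant"), c(6, 4, 5))
Date <- as.Date(c("2015-05-14", "2015-05-15", "2015-05-19", "2015-05-21",
                  "2015-05-24", "2015-05-28", "2015-05-14", "2015-05-20",
                  "2015-05-21", "2015-05-23", "2015-05-22", "2015-05-24",
                  "2015-05-28", "2015-06-02", "2015-06-04"))
df <- data.frame(Name, Date)

# Hypothetical helper: d is a numeric vector of day numbers for one player.
# A game is flagged when the adjacent game (before or after) is one day away.
flag_b2b <- function(d) {
  gap <- abs(diff(d)) == 1
  as.numeric(c(gap, FALSE) | c(FALSE, gap))
}

# ave() splits the dates by player, applies the helper to each group,
# and returns the results in the original row order.
df$b2b <- ave(as.numeric(df$Date), df$Name, FUN = flag_b2b)
print(df)
```

Because ave() never reorders rows, the new column lines up with the Desired_output vector from the question without any factor-level bookkeeping.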

Upvotes: 1
