user3012614

Reputation: 25

Add specific column value for first unique value in other column in R

I want to add a column to a data frame based on the values in another column. The new column should hold a specific value only the first time a given value appears in the other column. For example:

 s <- c(6,5,6,7,8,7,6,5)
 i <- c(4,5,4,3,2,3,4,5)
 t <- c(1,1,3,4,5,6,6,8)
 df<- data.frame(t,s,i)
 > df
   t s i
 1 1 6 4
 2 1 5 5
 3 3 6 4
 4 4 7 3
 5 5 8 2
 6 6 7 3
 7 6 6 4
 8 8 5 5

Now I want to add a column "mark" that is 1 the first time t=1 and the first time t=6, and 0 everywhere else, so that I get: 1 0 0 0 0 1 0 0. I have this code:

    for(i in 1:nrow(df)){
         if (df$t[i] == 1 & df$t[i-1] != 1 | (df$t[i] == 6 & df$t[i-1] != 6)){
              df$mark[i] <- 1
         } else {
              df$mark[i] <- 0
         }
    }

This however gives the following error:

     Error in if (df$t[i] == 1 & df$t[i - 1] != 1 | (df$t[i] == 6 & df$t[i -  :argument is of length zero

Can anyone tell me what is going wrong?

Upvotes: 1

Views: 68

Answers (2)

akrun

Reputation: 886948

  within(df, mark<- (c(1,diff(t %in% c(1,6)))==1) +0)
   #     t s i mark
   #   1 1 6 4    1
   #   2 1 5 5    0
   #   3 3 6 4    0
   #   4 4 7 3    0
   #   5 5 8 2    0
   #   6 6 7 3    1
   #   7 6 6 4    0
   #   8 8 5 5    0

Or

  duplicated(df$t, fromLast=TRUE) +0
  #[1] 1 0 0 0 0 1 0 0
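Note that the `duplicated` one-liner gives the desired result here only because 1 and 6 happen to be the only repeated values in t, and each appears exactly twice. A hedged sketch of a variant that names the target values explicitly (the `targets` variable is an illustrative name, not part of the original answer):

```r
df <- data.frame(t = c(1, 1, 3, 4, 5, 6, 6, 8),
                 s = c(6, 5, 6, 7, 8, 7, 6, 5),
                 i = c(4, 5, 4, 3, 2, 3, 4, 5))

targets <- c(1, 6)  # illustrative name: the values of t to mark

# TRUE only where t is one of the targets AND it is the first occurrence of that value
df$mark <- as.integer(df$t %in% targets & !duplicated(df$t))
df$mark
# [1] 1 0 0 0 0 1 0 0
```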

Upvotes: 0

David Arenburg

Reputation: 92282

Don't use loops, just do

    df$mark <- 0
    df$mark[match(c(1, 6), df$t)] <- 1

From the ?match documentation:

match returns a vector of the positions of (first) matches of its first argument in its second.
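As a small illustration of that behaviour on the question's t column (the names `pos` and `mark` are used here only for the example):

```r
t <- c(1, 1, 3, 4, 5, 6, 6, 8)

# Position of the first 1 and of the first 6 in t
pos <- match(c(1, 6), t)
pos
# [1] 1 6

# Start from all zeros, then set only those positions to 1
mark <- numeric(length(t))
mark[pos] <- 1
mark
# [1] 1 0 0 0 0 1 0 0
```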

The reason you are getting an error in your loop is that you loop from 1 to nrow(df) but refer to df$t[i-1] inside the loop. In the first iteration that is df$t[0], which returns a zero-length vector rather than a value, so the condition passed to if has length zero.
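To see the failure on the first iteration, and one possible way to keep a loop if you really want one, here is a minimal sketch. The names `k`, `is_target` and `is_new` are my own, and it assumes "first time" means the start of a run of 1s or 6s, as in the original loop:

```r
df <- data.frame(t = c(1, 1, 3, 4, 5, 6, 6, 8))

length(df$t[0])               # 0: indexing with 0 returns an empty vector
df$t[1] == 1 & df$t[0] != 1   # logical(0), which if() cannot evaluate

df$mark <- 0
for (k in seq_len(nrow(df))) {
  is_target <- df$t[k] %in% c(1, 6)
  # || short-circuits, so df$t[k - 1] is never evaluated when k == 1
  is_new <- k == 1 || df$t[k - 1] != df$t[k]
  if (is_target && is_new) df$mark[k] <- 1
}
df$mark
# [1] 1 0 0 0 0 1 0 0
```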

Upvotes: 1
