A M

Reputation: 53

How to select only the first element of each run of equal values in R?

I have two vectors:

 x<-c(1,1,1,2,2,2,3,3,4,5,5,6)
 y<-c("a","b","c","d","e","f","g","h","i","j","k","l")

I'd like to keep only those elements of y that line up with the first element of each sequence of equal values in x. So, in my case, the final result should look like this:

    y x
 1  a 1
 4  d 2
 7  g 3
 9  i 4
 10 j 5
 12 l 6

We wrote a script and it works, but it has to add and delete extra rows (so that cbind can be used later), which feels messy to me.

aaa<-data.frame(y,x)
df<-NULL
for (i in 2:length(aaa$x)){   # starts from the second element because of x[i-1]
   bbb<-ifelse((aaa$x[i]!= aaa$x[i-1]), aaa$x[i], NA)
   df<-rbind(df,bbb)
}
df
df<-rbind(1,df)
aaa$x<-df[,1]
bbb<-na.omit(aaa)
bbb

I tried to apply rle(), as was recommended to me earlier in "How to choose non-interruped numbers only?", but could not make it work in this case.

I would love to hear your recommendations.

Thank you.

Upvotes: 0

Views: 423

Answers (2)

Sven Hohenstein

Reputation: 81693

Here's a simple solution:

aaa <- data.frame(y, x)
aaa[!duplicated(aaa$x), ]

#    y x
# 1  a 1
# 4  d 2
# 7  g 3
# 9  i 4
# 10 j 5
# 12 l 6
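
One caveat, offered as a hedged aside: `duplicated()` marks the first occurrence of each value in the whole vector. That matches the first element of each run only when equal values are contiguous, as they are in the question's data. If a value can come back later, comparing each element with its predecessor is one possible alternative. The sketch below uses made-up vectors `x2` and `y2` (not from the question) to show the difference:

```r
# Hypothetical data: the value 1 comes back after a run of 2s
x2 <- c(1, 1, 2, 2, 1, 3)
y2 <- c("a", "b", "c", "d", "e", "f")

# First occurrence of each value over the whole vector
first_overall <- !duplicated(x2)

# First element of each run: position 1, plus every position whose
# value differs from the one just before it
run_start <- c(TRUE, x2[-1] != x2[-length(x2)])

y2[first_overall]  # "a" "c" "f"
y2[run_start]      # "a" "c" "e" "f"
```

For sorted or grouped input the two masks are identical, so the shorter `duplicated()` form is enough there.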

Upvotes: 3

Sherlock

Reputation: 5627

Maybe something like this:

justFirst <- function(x, y){
    # Both vectors must have the same length and more than one element
    stopifnot(length(x) > 1 && length(y) > 1 && length(x) == length(y))
    newX <- newY <- vector()
    for (i in seq_along(x)){
        # Keep the first element, and any element that differs from its predecessor
        if (i == 1 || x[i] != x[i-1]){
            newX <- c(newX, x[i])
            newY <- c(newY, y[i])
        }
    }
    return(data.frame(newX, newY))
}

x<-c(1,1,1,2,2,2,3,3,4,5,5,6)
y<-c("a","b","c","d","e","f","g","h","i","j","k","l")

justFirst(x, y)

I'm putting this here as an alternative, in case x and y look like this (unordered, with values that recur in separate runs):

x<-c(1,1,1,2,1,2,3,3,4,5,5,6,7,4,4,4,4,3,2,3)
y<-c("a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t")

But maybe there is a better way of handling this situation ...
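
As one possible vectorised take on the same idea, `rle()` can report the length of every run, and the run start positions follow from the cumulative lengths. This is only a sketch; the names `runs`, `starts`, and `result` are illustrative and not part of the answer above. On the messy vectors it should pick the same rows as the loop in `justFirst()`:

```r
x <- c(1,1,1,2,1,2,3,3,4,5,5,6,7,4,4,4,4,3,2,3)
y <- c("a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t")

# Run-length encoding: one entry per run of equal neighbouring values
runs <- rle(x)

# cumsum() gives the last position of each run; stepping back by the
# run length (and adding 1) gives the first position
starts <- cumsum(runs$lengths) - runs$lengths + 1

result <- data.frame(y = y[starts], x = x[starts])
result
```

Unlike growing `newX` and `newY` inside a loop, this indexes both vectors once, which tends to matter only for long inputs.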

Upvotes: 1
