Compare different ids in a column in R

Question

I have a column in a data frame that contains different ids, some of which are duplicated. Starting from the first id, I want to check whether the same id is present in the next row of the column. If it is, I do something; if not, I move on to the next id and repeat. Here is the column in the df:

     V4
Contig1401|m.3412
Contig1428|m.3512
Contig1755|m.4465
Contig1755|m.4465
Contig1897|m.4878
Contig1897|m.4878
Contig1757|m.4476
Contig1598|m.4011
Contig1759|m.4481
Contig1685|m.4244

As you can see, some ids are duplicated and some are not. How do I go about this? So far I have written this:

first_id <- "Contig1401|m.3412"

for (i in data$V4) {
  if (i == first_id) {
    # do something
  } else {
    # do something else
  }
}

But I don't understand how to proceed from here. Basically, I want to obtain this:

V4                 V5
Contig1401|m.3412  1
Contig1428|m.3512  1
Contig1755|m.4465  2
Contig1755|m.4465
Contig1897|m.4878  2
Contig1897|m.4878
Contig1757|m.4476  1
Contig1598|m.4011  1
Contig1759|m.4481  1
Contig1685|m.4244  1

Any ideas on how I can do this?

Thanks, Upendra

user20650 · Accepted Answer

Not sure if this does what you want, but it produces your final table:

df <- read.table(text="Contig1401|m.3412
Contig1428|m.3512
Contig1755|m.4465
Contig1755|m.4465
Contig1897|m.4878
Contig1897|m.4878
Contig1757|m.4476
Contig1598|m.4011
Contig1759|m.4481
Contig1685|m.4244", header=FALSE, stringsAsFactors=FALSE)

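# One way: mark ids that recur later with 2 and all others with 1,
# then blank out the repeated rows
df$id <- duplicated(df$V1, fromLast=TRUE) + 1
df$id[duplicated(df$V1)] <- NA

# Or: give each row the length of its run of consecutive identical ids,
# then blank out the repeated rows
df$id <- rep(rle(df$V1)$lengths, rle(df$V1)$lengths)
df$id[duplicated(df$V1)] <- NA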
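
Note that the first approach can only ever produce 1 or 2, so an id that appears three or more times in a row would still be labelled 2. A minimal sketch of a run-based variant follows, assuming that "duplicated" means consecutive identical rows, as in the sample data. The helper name run_counts and the toy ids vector are hypothetical and not part of the original answer.

```r
# Hypothetical helper: put the run length on the first row of each run of
# consecutive identical ids, and NA on the remaining rows of that run.
run_counts <- function(x) {
  if (length(x) == 0L) return(integer(0))
  r <- rle(as.character(x))
  out <- rep(NA_integer_, length(x))
  # Position of the first element of each run
  starts <- cumsum(c(1L, head(r$lengths, -1L)))
  out[starts] <- r$lengths
  out
}

# Toy input (an assumption for illustration): "b" appears three times in a row
ids <- c("a", "b", "b", "b", "c", "c", "d")
res <- run_counts(ids)
print(res)
```

With the data frame from the answer, this could be applied as df$id <- run_counts(df$V1). Because it works on runs rather than on duplicated(), an id that reappears later in a separate, non-adjacent block is counted as a new run.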
