Indexing by two columns consistently

Question

How can I create a consistently column 'id' using two columns? I need to get an index column 'id' by 'Date' and 'Tr' respectively. I have data:

Date    Tr
1   20190829    1
2   20190829    1
3   20190829    1
4   20190829    1
5   20190829    2
6   20190829    2
7   20190830    2
8   20190830    2
9   20190830    2
10  20190830    2
11  20190830    1
12  20190830    1

Example data:

data<-data.frame(Date=c(rep(20190829,6),rep(20190830,6)),Tr=c(rep(1,4),rep(2,6),rep(1,2)))

I'm trying to use somthing like this:

data$id <- cumsum(c(TRUE, diff(data$Tr) != 0))

    Date    Tr  Id
1   20190829    1   1
2   20190829    1   1
3   20190829    1   1
4   20190829    1   1
5   20190829    2   2
6   20190829    2   2
7   20190830    2   2
8   20190830    2   2
9   20190830    2   2
10  20190830    2   2
11  20190830    1   3
12  20190830    1   3

But I need a result:

data2<-data.frame(Date=c(rep(20190829,6),rep(20190830,6)),Tr=c(rep(1,4),rep(2,6),rep(1,2)),id=c(rep(1,4),rep(2,2),rep(3,4),rep(4,2)))

    Date    Tr  Id
1   20190829    1   1
2   20190829    1   1
3   20190829    1   1
4   20190829    1   1
5   20190829    2   2
6   20190829    2   2
7   20190830    2   3
8   20190830    2   3
9   20190830    2   3
10  20190830    2   3
11  20190830    1   4
12  20190830    1   4

How can i do it using R? is there any package which would give quick results?

Sotos · Accepted Answer

An idea via base R can be,

i1 <- with(data, ave(Tr, Date, FUN = function(i) cumsum(c(TRUE, diff(i) != 0))))

cumsum(c(TRUE, diff(i1) != 0))
#[1] 1 1 1 1 2 2 3 3 3 3 4 4

#or add it to your data frame,

data$new_id <- cumsum(c(TRUE, diff(i1) != 0))

Indexing by two columns consistently

Answers (2)

Related Questions