Seydou GORO
Seydou GORO

Reputation: 1285

How to calculate the number of months from the initial date for each individual

This is a representation of my dataset

ID<-c(rep(1,10),rep(2,8))

year<-c(2007,2007,2007,2008,2008,2009,2010,2009,2010,2011,
        2008,2008,2009,2010,2009,2010,2011,2011)
month<-c(2,7,12,4,11,6,11,1,9,4,3,6,7,4,9,11,2,8)

mydata<-data.frame(ID,year,month)

I want to calculate for each individual the number of months from the initial date. I am using two variables: year and month.

I firstly order years and months:

mydata2<-mydata%>%group_by(ID,year)%>%arrange(year,month,.by_group=T) 

Then I created the variable date considering that the day begin with 01:

mydata2$date<-paste("01",mydata2$month,mydata2$year,sep = "-")

then I used lubridate to change this variable in date format

mydata2$date<-dmy(mydata2$date)

But after this, I really don't know what to do, in order to have such a dataset (preferably using dplyr code) below:

  ID year month       date dif_from_init
1   1 2007     2  01-2-2007             0
2   1 2007     7  01-7-2007             5
3   1 2007    12 01-12-2007            10
4   1 2008     4  01-4-2008            14
5   1 2008    11 01-11-2008            21
6   1 2009     1  01-1-2009            23
7   1 2009     6  01-6-2009            28
8   1 2010     9  01-9-2010            43
9   1 2010    11 01-11-2010            45
10  1 2011     4  01-4-2011            50
11  2 2008     3  01-3-2008             0
12  2 2008     6  01-6-2008             3
13  2 2009     7  01-7-2009            16
14  2 2009     9  01-9-2009            18
15  2 2010     4  01-4-2010            25
16  2 2010    11 01-11-2010            32
17  2 2011     2  01-2-2011            35
18  2 2011     8  01-8-2011            41

Upvotes: 0

Views: 98

Answers (1)

Onyambu
Onyambu

Reputation: 79188

One way could be:

mydata %>%
  group_by(ID) %>%
  mutate(date = as.Date(sprintf('%d-%d-01',year, month)), 
         diff = as.numeric(round((date - date[1])/365*12)))

# A tibble: 18 x 5
# Groups:   ID [2]
      ID  year month date        diff
   <dbl> <dbl> <dbl> <date>     <dbl>
 1     1  2007     2 2007-02-01     0
 2     1  2007     7 2007-07-01     5
 3     1  2007    12 2007-12-01    10
 4     1  2008     4 2008-04-01    14
 5     1  2008    11 2008-11-01    21
 6     1  2009     6 2009-06-01    28
 7     1  2010    11 2010-11-01    45
 8     1  2009     1 2009-01-01    23
 9     1  2010     9 2010-09-01    43
10     1  2011     4 2011-04-01    50
11     2  2008     3 2008-03-01     0
12     2  2008     6 2008-06-01     3
13     2  2009     7 2009-07-01    16
14     2  2010     4 2010-04-01    25
15     2  2009     9 2009-09-01    18
16     2  2010    11 2010-11-01    32
17     2  2011     2 2011-02-01    35
18     2  2011     8 2011-08-01    41

Upvotes: 1

Related Questions