Reputation: 39
I have a dataframe 'my_data' which looks like this:
Calendar_Day Name
2018-03-31 ABC
2018-03-31 XYZ
2018-03-31 OPR
2019-01-31 ABC
2019-01-31 RTE
2019-10-31 YUD
2018-03-31 RYT
I wish to have another column that will serve as a primary key with a format
YEAR+MONTH+6digit sequence , eg: 201803000001
I am new to R and couldn't find a way to implement the concept.
and Dataframe should look like
Calendar_Day Name ID
2018-03-31 ABC 201803000001
2018-03-31 XYZ 201803000002
2018-03-31 OPR 201803000003
2019-01-31 ABC 201901000001
2019-01-31 RTE 201901000002
2019-10-31 YUD 201910000001
2018-03-31 RYT 201803000004
Upvotes: 1
Views: 456
Reputation: 11981
you could use the tidyverse package like this:
library(tidyverse)
mydata %>%
mutate(Date2 = format(Date, "%Y%m")) %>%
group_by(Date2) %>%
mutate(ID = paste0(Date2, str_pad(1:n(), width = 6, side = "left", pad = "0"))) %>%
ungroup() %>%
select(-Date2)
The main idea is to use the format
function: format(mydate, %Y)
returns the year of a date object and format(mydate, %m)
returns the month of a date object.
I paste these two together and add the six digit sequence.
I use string_pad
to add leading zeros to the sequence.
Upvotes: 1
Reputation: 32538
library(dplyr)
library(lubridate)
d %>%
mutate(Date = ymd(Date)) %>%
group_by(tmp1 = year(Date), tmp2 = month(Date)) %>%
mutate(ID = paste0(year(Date),
sprintf("%02d", month(Date)),
sprintf("%05d", row_number()))) %>%
ungroup() %>%
select(-tmp1, -tmp2)
#> # A tibble: 7 x 3
#> Date Name ID
#> <date> <chr> <chr>
#> 1 2018-03-31 ABC 20180300001
#> 2 2018-03-31 XYZ 20180300002
#> 3 2018-03-31 OPR 20180300003
#> 4 2019-01-31 ABC 20190100001
#> 5 2019-01-31 RTE 20190100002
#> 6 2019-10-31 YUD 20191000001
#> 7 2018-03-31 RYT 20180300004
Upvotes: 3