firmo23
firmo23

Reputation: 8404

Extract month and year from a POSIXct column to create a new column

I have the data below:

pr5915<-structure(list(activity = c("Forged Wire, Medium (Sport)", "Forged Wire, Medium (Sport)", 
"Forged Wire, Medium (Sport)", "Forged Wire, Medium (Sport)", 
"BBH-1&2", "Forged Wire, Medium (Sport)", "TCE Cleaning", "SOLO Oil", 
"Tempering", "TCE Cleaning", "Tempering", "BBH-1&2", "BBH-1&2", 
"TCE Cleaning", "Tempering"), timestamp = structure(c(-62042010952, 
-62010323872, -61884118792, -61821087232, -61821025132, -61789534972, 
-61789520332, -61789494952, -61757940232, -61757925652, -61757895952, 
-61694827552, -61694808292, -61663337152, -61663307452), tzone = "Europe/Istanbul", class = c("POSIXct", 
"POSIXt"))), row.names = c(NA, -15L), class = c("eventlog", "log", 
"tbl_df", "tbl", "data.frame"), case_id = "case_id", activity_id = "activity", activity_instance_id = "action", lifecycle_id = "lifecycle", resource_id = "resource", timestamp = "timestamp")

enter image description here

and I want to create 1 new column named Date which will extract the month and the Year from the timestamp column. For example if month is 12 and year 20 then the new column will have "December-20".

***formated dataset

pr5915<-structure(list(case_id = c("WC4120721", "WC4120667", "WC4120689"
), lifecycle = c(110, 110, 110), action = c("WC4120721-CN354877", 
"WC4120667-CN354878", "WC4120689-CN356752"), activity = c("Forged Wire, Medium (Sport)", 
"Forged Wire, Medium (Sport)", "Forged Wire, Medium (Sport)"), 
    resource = c("3419", "3216", "3409"), timestamp = structure(c(1606964400, 
    1607115480, 1607435760), tzone = "", class = c("POSIXct", 
    "POSIXt"))), row.names = c(NA, -3L), class = c("tbl_df", 
"tbl", "data.frame"))

Upvotes: 1

Views: 379

Answers (1)

Martin C. Arnold
Martin C. Arnold

Reputation: 9668

The following does the trick using functions form tidyverse and lubridate:

library(tidyverse)
library(lubridate)

pr5915 %>% 
 mutate(
  timestamp = ymd_hms(
    as_datetime(
     as.character(timestamp), format = "00%d-%m-%y %H:%M:%S")
    ),
  Date = paste0(
   month(timestamp, label = T, abbr = F), 
   "-", 
   substr(year(timestamp), 3 , 4)
  )
)

Outcome

# A tibble: 15 × 3
   activity                    timestamp           Date       
   <chr>                       <dttm>              <chr>      
 1 Forged Wire, Medium (Sport) 2020-12-03 06:00:00 December-20
 2 Forged Wire, Medium (Sport) 2020-12-04 23:58:00 December-20
 3 Forged Wire, Medium (Sport) 2020-12-08 16:56:00 December-20
 4 Forged Wire, Medium (Sport) 2020-12-10 05:42:00 December-20
 5 BBH-1&2                     2020-12-10 22:57:00 December-20
 6 Forged Wire, Medium (Sport) 2020-12-11 10:13:00 December-20
 7 TCE Cleaning                2020-12-11 14:17:00 December-20
 8 SOLO Oil                    2020-12-11 21:20:00 December-20
 ...

For the formatted dataset you need to alter the format argument to match the dttm format: format = "%Y-%m-%d %H:%M:%S"

Upvotes: 3

Related Questions