Reputation: 1238
Having a dataframe like this:
data.frame(id = c(1,3), timestamp = c("20-10-2009 11:35:12", "01-01-2017 12:21:21"), stringAsFactor = FALSE)
How is it possible to keep only year in the timestamp column having in mind that all years are after 2000? An expected output:
data.frame(id = c(1,3), timestamp = c("2009", "2017"), stringAsFactor = FALSE)
Upvotes: 0
Views: 65
Reputation: 21400
Base R:
format(as.Date(df$timestamp, "%d-%m-%Y %H:%M:%S"), "%Y")
[1] "2009" "2017"
So in the dataframe:
df$year <- format(as.Date(df$timestamp, "%d-%m-%Y %H:%M:%S"), "%Y")
id timestamp year
1 1 20-10-2009 11:35:12 2009
2 3 01-01-2017 12:21:21 2017
Another option, if you're into or familiar with regex, is this:
sub(".*([0-9]{4}).*", "\\1", df$timestamp)
[1] "2009" "2017"
Upvotes: 4
Reputation: 163
I have a tidyverse solution to your problem:
library(tidyverse)
data.frame(id = c(1,3), timestamp = c("20-10-2009 11:35:12", "01-01-2017 12:21:21"), stringAsFactor = FALSE) %>%
mutate(timestamp = timestamp %>%
str_extract("\\d{4}"))
The function str_extract("\\d{4}")
should always extract the first four digits of your target variable.
Upvotes: 1
Reputation: 545
See if this answers your question. The code and the output is as follows :-
library(lubridate)
library(tidyverse)
df <- data.frame(id = c(1,3,4), timestamp = c("20-10-2009 11:35:12", "01-01-2017 12:21:21","01-01-1998 12:21:21"), stringAsFactor = FALSE)
df$timestamp <- dmy_hms(df$timestamp)
df1 <- df %>%
filter(year(timestamp) > 2000) %>%
mutate(new_year = year(timestamp))
df1
#id timestamp stringAsFactor new_year
#1 1 2009-10-20 11:35:12 FALSE 2009
#2 3 2017-01-01 12:21:21 FALSE 2017
Upvotes: 2
Reputation: 606
If you're not afraid of external packages, one option would be to make use of the lubridate package:
df <- data.frame(id = c(1,3), timestamp = c("20-10-2009 11:35:12", "01-01-2017 12:21:21"))
df <- df %>%
mutate(timestamp = lubridate::dmy_hms(timestamp)) %>%
mutate(year = lubridate::year(timestamp))
Obviously, if you actually want to replace the timestampe column, you have to change the last mutate command. Result:
id timestamp year
1 1 2009-10-20 11:35:12 2009
2 3 2017-01-01 12:21:21 2017
Upvotes: 1