Hassan Saqib

Reputation: 2757

How to extract Month name from Timestamp in R?

I have time-stamps in one column of my dataframe. They look like

"Tue May 14 21:57:04 +0000 2013"

I want to replace the whole timestamp with only the month name. How can I do it in R? Let's say the column name is "timestamp" and the dataframe name is "Df".

Below is the sample of some more entries.

    "Wed Jul 10 01:30:36 +0000 2013"    
    "Fri Apr 20 01:46:59 +0000 2012"    
    "Sat Jul 07 17:56:34 +0000 2012"    
    "Sat Mar 16 02:12:30 +0000 2013"    
    "Sat Feb 16 02:29:11 +0000 2013"

I want these to look like

Jul
Apr
Jul
Mar
Feb

Your help will be highly appreciated.

Upvotes: 1

Views: 3681

Answers (5)

G. Grothendieck

Reputation: 269654

1) The month name always occupies character positions 5 through 7 (inclusive) of the timestamp, so this replaces the timestamp column with a character column of month abbreviations:

transform(DF, timestamp = format(substr(timestamp, 5, 7)))

The output is:

  timestamp
1       Apr
2       Feb
3       Jul
4       Mar
5       Jul

2) If you wanted a factor column instead then use this variation which ensures that the factor levels are Jan=1, Feb=2, etc. rather than being assigned alphabetically:

transform(DF, timestamp = factor(substr(timestamp, 5, 7), levels = month.abb))

Note: We have assumed input in the following reproducible form:

DF <- data.frame(timestamp = c("Fri Apr 20 01:46:59 +0000 2012", 
  "Sat Feb 16 02:29:11 +0000 2013", "Sat Jul 07 17:56:34 +0000 2012", 
  "Sat Mar 16 02:12:30 +0000 2013", "Wed Jul 10 01:30:36 +0000 2013"))
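To see why the explicit levels in (2) matter, here is a small sketch that compares a default factor with one built on month.abb. It reuses the DF from the note above; the names mon, alpha and cal are only for illustration.

```r
DF <- data.frame(timestamp = c("Fri Apr 20 01:46:59 +0000 2012",
  "Sat Feb 16 02:29:11 +0000 2013", "Sat Jul 07 17:56:34 +0000 2012",
  "Sat Mar 16 02:12:30 +0000 2013", "Wed Jul 10 01:30:36 +0000 2013"))

mon <- substr(DF$timestamp, 5, 7)

# Default factor: levels are sorted alphabetically
alpha <- factor(mon)
levels(alpha)
#[1] "Apr" "Feb" "Jul" "Mar"

# Explicit levels: calendar order, so the integer codes are month numbers
cal <- factor(mon, levels = month.abb)
as.integer(cal)
#[1] 4 2 7 3 7

# Sorting now follows the calendar rather than the alphabet
as.character(sort(cal))
#[1] "Feb" "Mar" "Apr" "Jul" "Jul"
```

With the calendar-ordered factor, tables and plots grouped by month should also come out in calendar order.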

Upvotes: 0

akrun

Reputation: 887168

We can use sub. The pattern matches one or more non-whitespace characters (\\S+) followed by one or more whitespace characters (\\s+), then captures the next run of non-whitespace characters as a group ((\\S+)), followed by the rest of the string (.*). The whole match is replaced with the backreference (\\1) to the captured group.

sub("\\S+\\s+(\\S+).*", "\\1", v1)
#[1] "May" "Jul" "Apr" "Jul" "Mar" "Feb"

It may be better to use DateTime conversions (as @DirkEddelbuettel mentioned in the comments) if we know how to get the format correct.

data

v1 <- c("Tue May 14 21:57:04 +0000 2013", "Wed Jul 10 01:30:36 +0000 2013", 
"Fri Apr 20 01:46:59 +0000 2012", "Sat Jul 07 17:56:34 +0000 2012", 
"Sat Mar 16 02:12:30 +0000 2013", "Sat Feb 16 02:29:11 +0000 2013")
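One thing to keep in mind with the sub approach: when the pattern does not match, sub returns the input string unchanged, so a malformed entry passes through silently. A minimal sketch of a guard, assuming it is acceptable to turn such entries into NA (the third value below is made up for illustration):

```r
v1 <- c("Tue May 14 21:57:04 +0000 2013",
        "Wed Jul 10 01:30:36 +0000 2013",
        "not-a-timestamp")  # hypothetical malformed entry

mon <- sub("\\S+\\s+(\\S+).*", "\\1", v1)
mon
#[1] "May"             "Jul"             "not-a-timestamp"

# Keep only real month abbreviations; anything else becomes NA
mon[!mon %in% month.abb] <- NA
mon
#[1] "May" "Jul" NA
```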

Upvotes: 2

Dirk is no longer here

Reputation: 368251

Assign the source data using akrun's vector:

R> dates <- c("Tue May 14 21:57:04 +0000 2013", "Wed Jul 10 01:30:36 +0000 2013", 
              "Fri Apr 20 01:46:59 +0000 2012", "Sat Jul 07 17:56:34 +0000 2012", 
              "Sat Mar 16 02:12:30 +0000 2013", "Sat Feb 16 02:29:11 +0000 2013")
R> dates
[1] "Tue May 14 21:57:04 +0000 2013"
[2] "Wed Jul 10 01:30:36 +0000 2013"
[3] "Fri Apr 20 01:46:59 +0000 2012"
[4] "Sat Jul 07 17:56:34 +0000 2012"
[5] "Sat Mar 16 02:12:30 +0000 2013"
[6] "Sat Feb 16 02:29:11 +0000 2013"
R> 
Parse using the appropriate strptime format:
R> pt <- strptime(dates, "%a %b %d %H:%M:%S +0000 %Y")
R> pt
[1] "2013-05-14 21:57:04 CDT" "2013-07-10 01:30:36 CDT"
[3] "2012-04-20 01:46:59 CDT" "2012-07-07 17:56:34 CDT"
[5] "2013-03-16 02:12:30 CDT" "2013-02-16 02:29:11 CST"
R> 
Re-format to keep just the desired month:
R> strftime(pt, "%m")
[1] "05" "07" "04" "07" "03" "02"
R> strftime(pt, "%b")
[1] "May" "Jul" "Apr" "Jul" "Mar" "Feb"
R> strftime(pt, "%B")
[1] "May"      "July"     "April"    "July"     "March"   
[6] "February"
R> 
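Note that the parsed times above print as CDT/CST: the "+0000" in the format is matched as literal text, so the clock time is interpreted in the session's local time zone. The extracted month is still the one in the string, but if the parsed values are reused elsewhere it may be safer to read the offset with %z and keep the result in UTC. The following is a sketch under two assumptions: the input month and day names are English, and it is fine to switch LC_TIME to "C" for the session (since %a and %b are locale-dependent).

```r
# %a and %b depend on the locale; use the C locale so English names parse
invisible(Sys.setlocale("LC_TIME", "C"))

dates <- c("Tue May 14 21:57:04 +0000 2013",
           "Sat Feb 16 02:29:11 +0000 2013")

# %z reads the "+0000" offset; tz = "UTC" keeps the result in UTC
pt <- as.POSIXct(dates, format = "%a %b %d %H:%M:%S %z %Y", tz = "UTC")

format(pt, "%H:%M:%S %Z")
#[1] "21:57:04 UTC" "02:29:11 UTC"
format(pt, "%b")
#[1] "May" "Feb"
```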

Upvotes: 7

Robert

Reputation: 5152

Assuming your timestamp is text:

df <- data.frame(timestamp = c("Tue May 14 21:57:04 +0000 2013",
                               "Fri Apr 20 01:46:59 +0000 2012",
                               "Sat Mar 16 02:12:30 +0000 2013"),
                 stringsAsFactors = FALSE)
# split each timestamp on spaces and keep the second field (the month)
df$month <- sapply(strsplit(df$timestamp, " ", fixed = TRUE), `[`, 2)
df

> df
                       timestamp month
1 Tue May 14 21:57:04 +0000 2013   May
2 Fri Apr 20 01:46:59 +0000 2012   Apr
3 Sat Mar 16 02:12:30 +0000 2013   Mar

Upvotes: 2

Ronak Shah

Reputation: 388982

You can use strptime along with format.

Assuming x is a character vector of timestamps, we can first convert it to a "POSIXlt" "POSIXt" object and then extract the month (%b) from it:

format(strptime(x, "%a %b %d %H:%M:%S +0000 %Y"), "%b")

#[1] "Jul" "Apr" "Jul" "Mar" "Feb"
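Applied to the data frame from the question, this might look like the sketch below. The Df here is a small stand-in built from two of the question's rows plus one made-up bad row, to show that entries strptime cannot parse come back as NA. It assumes an English (or C) LC_TIME locale, since %a and %b are locale-dependent.

```r
Df <- data.frame(timestamp = c("Wed Jul 10 01:30:36 +0000 2013",
                               "Fri Apr 20 01:46:59 +0000 2012",
                               "garbage"),  # hypothetical unparseable row
                 stringsAsFactors = FALSE)

# Parse, then overwrite the column with the abbreviated month name
parsed <- strptime(Df$timestamp, "%a %b %d %H:%M:%S +0000 %Y", tz = "UTC")
Df$timestamp <- format(parsed, "%b")
Df$timestamp
#[1] "Jul" "Apr" NA
```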

Upvotes: 3
