Reputation: 2757
I have time-stamps in one column of my dataframe. They look like
"Tue May 14 21:57:04 +0000 2013"
I want to replace the whole timestamp with only the month name. How can I do it in R? Let's say the column name is "timestamp" and the dataframe name is "Df".
Below is a sample of some more entries.
"Wed Jul 10 01:30:36 +0000 2013"
"Fri Apr 20 01:46:59 +0000 2012"
"Sat Jul 07 17:56:34 +0000 2012"
"Sat Mar 16 02:12:30 +0000 2013"
"Sat Feb 16 02:29:11 +0000 2013"
I want these to look like
Jul
Apr
Jul
Mar
Feb
Your help will be highly appreciated.
Upvotes: 1
Views: 3681
Reputation: 269654
1) The month name is always in character positions 5 through 7 (inclusive) of the timestamp column, so this replaces the timestamp column with a character column of months:
transform(DF, timestamp = format(substr(timestamp, 5, 7)))
The output is:
timestamp
1 Jul
2 Apr
3 Jul
4 Mar
5 Feb
2) If you wanted a factor column instead then use this variation which ensures that the factor levels are Jan=1, Feb=2, etc. rather than being assigned alphabetically:
transform(DF, timestamp = factor(substr(timestamp, 5, 7), levels = month.abb))
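To see why the explicit levels in variation 2 matter, here is a small sketch comparing the default factor with one built on month.abb. It rebuilds a sample DF with the same values as the question's sample; the names mon, alpha and cal are made up for illustration.

```r
# Sample input, same values as the question's sample
DF <- data.frame(timestamp = c("Wed Jul 10 01:30:36 +0000 2013",
                               "Fri Apr 20 01:46:59 +0000 2012",
                               "Sat Jul 07 17:56:34 +0000 2012",
                               "Sat Mar 16 02:12:30 +0000 2013",
                               "Sat Feb 16 02:29:11 +0000 2013"),
                 stringsAsFactors = FALSE)

mon <- substr(DF$timestamp, 5, 7)

alpha <- factor(mon)                      # levels sorted alphabetically
cal   <- factor(mon, levels = month.abb)  # levels in calendar order

levels(alpha)            # "Apr" "Feb" "Jul" "Mar"
as.integer(cal)          # 7 4 7 3 2, i.e. the month numbers
as.character(sort(cal))  # "Feb" "Mar" "Apr" "Jul" "Jul"
```

With the calendar-ordered levels, sorting, tabulating and plotting follow the order of the year instead of the alphabet.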
Note: We have assumed input in the following reproducible form (rows in the same order as the question's sample, which is the order of the output shown above):
DF <- data.frame(timestamp = c("Wed Jul 10 01:30:36 +0000 2013",
"Fri Apr 20 01:46:59 +0000 2012", "Sat Jul 07 17:56:34 +0000 2012",
"Sat Mar 16 02:12:30 +0000 2013", "Sat Feb 16 02:29:11 +0000 2013"))
Upvotes: 0
Reputation: 887168
We can use sub. Match one or more non-whitespace characters (\\S+) followed by one or more whitespace characters (\\s+), then capture the next run of non-whitespace characters as a group ((\\S+)), followed by any characters until the end of the string, and replace the whole match with the backreference (\\1) for the captured group.
sub("\\S+\\s+(\\S+).*", "\\1", v1)
#[1] "May" "Jul" "Apr" "Jul" "Mar" "Feb"
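One caveat with the sub approach: when the pattern does not match, sub returns the input unchanged rather than NA, so a malformed entry can slip through silently. Below is a minimal sketch of a check against the built-in month.abb; the vector v2 and its malformed entries are made up for illustration.

```r
# Hypothetical input: two well-formed timestamps, two malformed entries
v2 <- c("Tue May 14 21:57:04 +0000 2013",
        "Wed Jul 10 01:30:36 +0000 2013",
        "unknown",
        "2013-05-14 21:57:04")

# Same pattern as above: keep the second whitespace-separated field
mon <- sub("\\S+\\s+(\\S+).*", "\\1", v2)
mon  # "May" "Jul" "unknown" "21:57:04"

# Anything that is not a valid month abbreviation becomes NA
mon[!mon %in% month.abb] <- NA
mon  # "May" "Jul" NA NA
```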
It may be better to use date-time conversions (as @DirkEddelbuettel mentioned in the comments) if we know how to get the format correct.
The input vector v1 used above is:
v1 <- c("Tue May 14 21:57:04 +0000 2013", "Wed Jul 10 01:30:36 +0000 2013",
"Fri Apr 20 01:46:59 +0000 2012", "Sat Jul 07 17:56:34 +0000 2012",
"Sat Mar 16 02:12:30 +0000 2013", "Sat Feb 16 02:29:11 +0000 2013")
Upvotes: 2
Reputation: 368251
Assign the source data using akrun's vector:
R> dates <- c("Tue May 14 21:57:04 +0000 2013", "Wed Jul 10 01:30:36 +0000 2013",
"Fri Apr 20 01:46:59 +0000 2012", "Sat Jul 07 17:56:34 +0000 2012",
"Sat Mar 16 02:12:30 +0000 2013", "Sat Feb 16 02:29:11 +0000 2013")
R> dates
[1] "Tue May 14 21:57:04 +0000 2013"
[2] "Wed Jul 10 01:30:36 +0000 2013"
[3] "Fri Apr 20 01:46:59 +0000 2012"
[4] "Sat Jul 07 17:56:34 +0000 2012"
[5] "Sat Mar 16 02:12:30 +0000 2013"
[6] "Sat Feb 16 02:29:11 +0000 2013"
R>
Parse using the appropriate strptime format:
R> pt <- strptime(dates, "%a %b %d %H:%M:%S +0000 %Y")
R> pt
[1] "2013-05-14 21:57:04 CDT" "2013-07-10 01:30:36 CDT"
[3] "2012-04-20 01:46:59 CDT" "2012-07-07 17:56:34 CDT"
[5] "2013-03-16 02:12:30 CDT" "2013-02-16 02:29:11 CST"
R>
Re-format to get just the desired month:
R> strftime(pt, "%m")
[1] "05" "07" "04" "07" "03" "02"
R> strftime(pt, "%b")
[1] "May" "Jul" "Apr" "Jul" "Mar" "Feb"
R> strftime(pt, "%B")
[1] "May" "July" "April" "July" "March"
[6] "February"
R>
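Applied to the data frame from the question, the same idea could look like the sketch below. Two things here are my assumptions rather than part of the answer above: passing tz = "UTC" so the parsed times are labelled UTC instead of the local zone (the CDT/CST in the output above), and setting LC_TIME to "C" because %a and %b are matched against locale-specific names.

```r
# %a and %b depend on the locale; "C" gives the English abbreviations
invisible(Sys.setlocale("LC_TIME", "C"))

Df <- data.frame(timestamp = c("Wed Jul 10 01:30:36 +0000 2013",
                               "Fri Apr 20 01:46:59 +0000 2012",
                               "Sat Feb 16 02:29:11 +0000 2013"),
                 stringsAsFactors = FALSE)

# Treat the literal "+0000" as UTC rather than the local time zone
pt <- strptime(Df$timestamp, "%a %b %d %H:%M:%S +0000 %Y", tz = "UTC")

# Overwrite the column with the abbreviated month name
Df$timestamp <- format(pt, "%b")
Df$timestamp  # "Jul" "Apr" "Feb"
```

Keeping pt around instead of overwriting the column straight away leaves the full date-time available for other extractions, such as the year or the weekday.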
Upvotes: 7
Reputation: 5152
Assuming your timestamp is text:
df <- data.frame(timestamp = c("Tue May 14 21:57:04 +0000 2013",
                               "Fri Apr 20 01:46:59 +0000 2012",
                               "Sat Mar 16 02:12:30 +0000 2013"),
                 stringsAsFactors = FALSE)
# split each timestamp on spaces and keep the second field (the month)
df$month <- sapply(df$timestamp, function(sx) strsplit(sx, split = " ")[[1]][2])
df
> df
timestamp month
1 Tue May 14 21:57:04 +0000 2013 May
2 Fri Apr 20 01:46:59 +0000 2012 Apr
3 Sat Mar 16 02:12:30 +0000 2013 Mar
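strsplit is already vectorised, so the per-element sapply call can be avoided. Here is a minimal sketch of that variation, using vapply to pick the second field from each piece and to guarantee a plain character result. The name parts is made up, and fixed = TRUE is my addition that simply treats the separator as a literal space.

```r
df <- data.frame(timestamp = c("Tue May 14 21:57:04 +0000 2013",
                               "Fri Apr 20 01:46:59 +0000 2012",
                               "Sat Mar 16 02:12:30 +0000 2013"),
                 stringsAsFactors = FALSE)

# One strsplit call returns a list with one character vector per row
parts <- strsplit(df$timestamp, " ", fixed = TRUE)

# Take the second field of each; vapply enforces one string per element
df$month <- vapply(parts, function(p) p[2], character(1))
df$month  # "May" "Apr" "Mar"
```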
Upvotes: 2
Reputation: 388982
You can use strptime along with format.
Assuming x is a character vector, we can first convert it to the "POSIXlt" "POSIXt" class and then extract the month (%b) part of it:
format(strptime(x, "%a %b %d %H:%M:%S +0000 %Y"), "%b")
#[1] "Jul" "Apr" "Jul" "Mar" "Feb"
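A side effect worth knowing: strings that do not fit the format come back as NA instead of raising an error, which makes bad rows easy to find. Here is a small sketch with a made-up vector x; the tz = "UTC" argument and the LC_TIME setting are assumptions added so the example behaves the same on any machine.

```r
# %a and %b depend on the locale; "C" gives the English abbreviations
invisible(Sys.setlocale("LC_TIME", "C"))

# Hypothetical input: the second entry does not follow the expected format
x <- c("Sat Mar 16 02:12:30 +0000 2013",
       "2013-02-16 02:29:11",
       "Sat Feb 16 02:29:11 +0000 2013")

mon <- format(strptime(x, "%a %b %d %H:%M:%S +0000 %Y", tz = "UTC"), "%b")
mon                # "Mar" NA "Feb"

# Indices of the entries that failed to parse
which(is.na(mon))  # 2
```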
Upvotes: 3