Reputation: 36
I have a data frame of polls, one column of which is titled Date.s.administered and is formatted as a string containing the dates on which the poll was administered, for example "January 16-20, 2019" or "December 1-11, 2018". The entire column looks like this:
[1] "November 3–5, 2018" "November 1–2, 2018"
[3] "October 28–30, 2018" "October 22–28, 2018"
[5] "October 15–28, 2018" "October 15–28, 2018"
[7] "October 25–26, 2018" "October 18–21, 2018"
[9] "October 15–21, 2018" "October 12–18, 2018"
[11] "October 10–14, 2018" "October 9–13, 2018"
[13] "October 9–13, 2018" "October 8–13, 2018"
[15] "October 8–11, 2018" "October 3–9, 2018"
How would I manipulate this column so that it displays only the last date of the series (for example, "March 1-4, 2018" becomes "March 4, 2018")?
Upvotes: 0
Views: 50
Reputation: 40151
An approach using tidyverse could be:
library(dplyr)
library(tidyr)

date %>%
  separate(date, c("date1", "date2"), sep = "–") %>%  # split at the en dash
  mutate(date = paste(sub("[^[:alpha:]]+", "", date1), date2, sep = " ")) %>%  # month name + end date
  select(date)
date
1 November 2, 2018
2 October 28, 2018
3 October 28, 2018
First, it separates the "date" column into "date1" and "date2" at the "–" (an en dash). Then it strips the non-letter characters from "date1", leaving only the month name, and combines that with "date2" to form the desired "date" column.
Sample data:
date <- data.frame(date = c("November 1–2, 2018",
                            "October 22–28, 2018",
                            "October 15–28, 2018"))
Upvotes: 0
Reputation: 922
You could use the lubridate package in combination with a regex to pull out the month, the last day, and the year, then convert the result to a standard date.
suppressPackageStartupMessages(library(lubridate))
x <- "March 1-4, 2018"
# \\d+ allows two-digit days; [-–] accepts a hyphen or the en dash used in the column
mdy(gsub("(^.+)(\\s\\d+[-–])(\\d+)(,\\s)(\\d{4}$)", "\\1 \\3 \\5", x))
#> [1] "2018-03-04"
Upvotes: 0
Reputation: 14764
You could do:
gsub("\\d+–", "", df$Date.s.administered)
Example data:
df <- data.frame(Date.s.administered = c("November 3–5, 2018", "November 1–2, 2018"))
Output:
[1] "November 5, 2018" "November 2, 2018"
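A slightly more defensive variant of the same idea, as a rough sketch: the question's prose writes the ranges with a plain hyphen, while the printed column uses an en dash, so it may be safer to accept both. The helper name last_date and the vector polls are made up for illustration, and the en dash is written as the escape \u2013 so the code does not depend on how the file is encoded.

```r
# Hypothetical helper: drop the leading "<day><dash>" part of a date range.
# The character class accepts a plain hyphen (-) and an en dash (\u2013);
# \\s* tolerates optional spaces around the dash.
last_date <- function(x) {
  sub("\\d+\\s*[-\u2013]\\s*", "", x)
}

# Illustrative input mixing both dash styles
polls <- c("November 3\u20135, 2018",
           "October 22\u201328, 2018",
           "March 1-4, 2018")

last_date(polls)
#> [1] "November 5, 2018" "October 28, 2018" "March 4, 2018"
```

This only covers ranges within a single month, which is what the column shown in the question contains; a range that crosses a month boundary would need a different pattern.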
Upvotes: 2