Reputation: 5512
I have a column filename in a dataframe that looks like this:
/testData/THQ/TAIRATE.20030314.190000.tif
/testData/THQ/TAIRATE.20030314.200000.tif
/testData/THQ/TAIRATE.20030314.210000.tif
/testData/THQ/TAIRATE.20030314.220000.tif
I want to extract the timestamp from each path and store it in another column, but I am not familiar with regex. So far I have gotten to this:
tdat %>%
dplyr::rowwise() %>%
dplyr::mutate(timestamp = str_extract(as.character(filename), "[^//TAIRATE]+$")) %>%
glimpse()
This returns:
.20030314.190000.tif
.20030314.200000.tif
.20030314.210000.tif
.20030314.220000.tif
But the output I want is:
20030314190000
20030314200000
20030314210000
20030314220000
Question: How can I write the correct regex or is there a better way?
Upvotes: 0
Views: 211
Reputation: 21400
Certainly less elegant than @akrun's solution but this one works too:
paste0(unlist(str_extract_all(filename, "[0-9]+")), collapse = "")
Data:
filename <- "/testData/THQ/TAIRATE.20030314.190000.tif"
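Note that the one-liner above is written for a single string: with collapse = "" applied to the unlisted result, a whole column of paths would be merged into one long string. A minimal sketch of a per-element variant, assuming filename is a character vector (the two paths below are just copied from the question for illustration):

```r
library(stringr)

# Illustrative input: two of the paths from the question
filename <- c("/testData/THQ/TAIRATE.20030314.190000.tif",
              "/testData/THQ/TAIRATE.20030314.200000.tif")

# str_extract_all() returns a list with one character vector per input path;
# collapsing each list element separately keeps one timestamp per path
timestamp <- vapply(str_extract_all(filename, "[0-9]+"),
                    paste0, character(1), collapse = "")
timestamp
```

This picks up every run of digits in the path, so it only works as long as the directory names and the file prefix contain no digits themselves.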
Upvotes: 1
Reputation: 388982
str_extract and other such functions are vectorized, so you don't need rowwise().
In this case, you can do this in base R using sub.
sub('.*TAIRATE\\.(\\d+)\\.(\\d+).*', '\\1\\2', df$filename)
#[1] "20030314190000" "20030314200000" "20030314210000" "20030314220000"
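To store the result as a new column, as the question asks, the same sub() call can go straight into mutate() without rowwise(). A rough sketch, assuming the data frame is called tdat as in the question (the small data frame built here is only a stand-in for it):

```r
library(dplyr)

# Stand-in for the question's tdat, with two of the example paths
tdat <- data.frame(
  filename = c("/testData/THQ/TAIRATE.20030314.190000.tif",
               "/testData/THQ/TAIRATE.20030314.200000.tif"),
  stringsAsFactors = FALSE
)

# sub() is vectorized, so mutate() applies it to the whole column at once;
# the two capture groups are the date and time digits, pasted back together
tdat <- tdat %>%
  mutate(timestamp = sub(".*TAIRATE\\.(\\d+)\\.(\\d+).*", "\\1\\2", filename))

tdat$timestamp
```

As for the original attempt, "[^//TAIRATE]+$" is a negated character class: it matches a trailing run of characters that are none of /, T, A, I, R, E, rather than the text following the literal word TAIRATE, which is why the dots and the .tif extension were kept.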
Upvotes: 1