Pulling out numbers before a phrase

Question

I'm struggling to use regex so any insight would be helpful. I have a list like this:

[1] "collected 1 hr total. wind >15 mph."    "collected 4 hr total. wind ~15 mph."
[3] "collected 10 hr total. gusts 5-10 mph." "collected 1 hr total. breeze at 1mph,"
[5] "collected 2 hrs."    [6]

I want:

 [1] > 15 mph
 [2] ~15 mph
 [3] 5-10 mph
 [4] 1mph
 [5] 
 [6]

I want to pull out the wind speed from each row. Can you suggest a suitable regular expression? As you can see, (a) there can be a variable number of spaces between the digits and "mph", and (b) the digits before "mph" can be preceded by a symbol (">", "<" or "~") or can form an interval joined by "-".

Thank you in advance!

akrun · Accepted Answer

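One option is str_extract from the stringr package. The character class [0-9- ] also matches the space in front of the digits, so trimws is used to strip it from the result: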

library(stringr)
trimws(str_extract(v1, "[>~]?[0-9- ]+mph"))
#[1] ">15 mph"   "~15 mph"   "5-10 mph" "1mph"     NA

data

v1 <- c("collected 1 hr total. wind >15 mph.",
        "collected 4 hr total. wind ~15 mph.",
        "collected 10 hr total. gusts 5-10 mph.",
        "collected 1 hr total. breeze at 1mph,",
        "collected 2 hrs.")
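
The question also mentions a "<" prefix and a variable number of spaces before "mph", which the pattern above does not cover explicitly. As a hedged sketch (not part of the accepted answer), a stricter pattern could be written in base R with regexpr and regmatches. The sixth element of the vector is a hypothetical row added here only to exercise the "<" case, and the name speed is likewise just illustrative.

```r
# Same data as above, plus one hypothetical row with "<" and extra spaces
v1 <- c("collected 1 hr total. wind >15 mph.",
        "collected 4 hr total. wind ~15 mph.",
        "collected 10 hr total. gusts 5-10 mph.",
        "collected 1 hr total. breeze at 1mph,",
        "collected 2 hrs.",
        "calm, wind <5   mph")  # hypothetical example, not from the question

# Optional symbol (<, > or ~), a number, an optional "-number" interval,
# then any amount of whitespace before "mph"
pattern <- "(?:[<>~]\\s*)?[0-9]+(?:\\s*-\\s*[0-9]+)?\\s*mph"

# regexpr returns -1 for elements without a match
m <- regexpr(pattern, v1, perl = TRUE)

# Start with NA everywhere and fill in only the elements that matched
speed <- rep(NA_character_, length(v1))
speed[m != -1] <- regmatches(v1, m)
speed
#[1] ">15 mph"  "~15 mph"  "5-10 mph" "1mph"     NA         "<5   mph"
```

Because the pattern starts at the symbol or the first digit rather than at a space, no trimws call is needed here. Rows without a wind speed come back as NA, as in the stringr version.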
