user3786126

Reputation: 11

R extract numbers from a string

The input is a character vector such as:

"-042-195" "+143-192" "-001*145" "#045+125" "#125$" 

How do I extract the last set of digits from each string? The expected result is:

"195" "192" "145" "125" "125"

Upvotes: 0

Views: 2860

Answers (4)

lawyeR

Reputation: 7654

Feels like piling on, but this works on the given sample:

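v1 <- c("-042-195","+143-192","-001*145", "#045+125", "#125$")
v1 <- gsub("\\D", "", v1)  # drop every non-digit character
v1 <- substr(v1, start = nchar(v1) - 2, stop = nchar(v1))  # keep the last three characters

gsub() strips everything except digits, then substr() keeps the final three digits. This relies on the last number always being exactly three digits long, which holds for the sample.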
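If the last number is not always three digits long, the fixed-width substr() step no longer applies. A minimal base R sketch for that case, assuming the goal is the last run of digits whatever its length (the helper name last_digits is made up for illustration):

```r
# Hypothetical helper: return the last run of digits in each string,
# or NA when a string contains no digits at all.
last_digits <- function(x) {
  runs <- regmatches(x, gregexpr("[0-9]+", x))  # all digit runs, one list element per string
  vapply(runs,
         function(r) if (length(r) > 0) r[length(r)] else NA_character_,
         character(1), USE.NAMES = FALSE)
}

v1 <- c("-042-195", "+143-192", "-001*145", "#045+125", "#125$")
last_digits(v1)
# [1] "195" "192" "145" "125" "125"

last_digits(c("a12-3456", "no digits"))
# [1] "3456" NA
```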

Upvotes: 0

akrun

Reputation: 886938

Try:

 v1 <- c("-042-195","+143-192","-001*145", "#045+125", "#125$")
 library(stringr)
 str_extract(v1, "(?<=[^0-9])[0-9]+(?=[^0-9]?$)")
 #[1] "195" "192" "145" "125" "125"

Explanation

 (?<=[^0-9])   # lookbehind: the match must be preceded by a non-digit
 [0-9]+        # one or more digits
 (?=[^0-9]?$)  # lookahead: at most one non-digit may follow before the end of the string
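The same pattern can also be tried without stringr. This is a sketch using base R's regexpr() with perl = TRUE, not part of the original answer. Note that regmatches() silently drops strings that do not match, so the result can be shorter than the input:

```r
v1 <- c("-042-195", "+143-192", "-001*145", "#045+125", "#125$")
pattern <- "(?<=[^0-9])[0-9]+(?=[^0-9]?$)"

m <- regexpr(pattern, v1, perl = TRUE)  # position of the first match in each string
res <- regmatches(v1, m)                # the matched substrings
res
# [1] "195" "192" "145" "125" "125"

# The lookbehind needs a non-digit before the digits, so a string that
# consists only of digits does not match at all (regexpr returns -1).
no_match <- regexpr(pattern, "195", perl = TRUE)
```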

Or

  sapply(str_extract_all(v1, "\\d+"),tail,1)
 #[1] "195" "192" "145" "125" "125"

Or

 library(stringi)
  stri_extract_last(v1,regex="\\d+")
 #[1] "195" "192" "145" "125" "125"

Upvotes: 2

G. Grothendieck

Reputation: 269421

1) sub. If ch is the input character vector from the question, use sub with a regular expression that matches anything (".*") up to a non-digit ("\\D"), followed by the digits ("\\d+"), followed by anything else (".*"), and return the captured digits:

sub(".*\\D(\\d+).*", "\\1", paste(" ", ch))
## [1] "195" "192" "145" "125" "125"

If we were guaranteed that the digits are preceded by at least one non-digit, which is the case for the example in the question, then paste(" ", ch) could be simplified to just ch:

sub(".*\\D(\\d+).*", "\\1", ch) 
## [1] "195" "192" "145" "125" "125"
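To see what the padding buys, here is a small sketch with one made-up input, "195x", whose digits sit at the very start of the string. Without a leading non-digit the pattern does not match, and sub returns the string unchanged:

```r
pat <- ".*\\D(\\d+).*"
x <- c("195x", "-042-195")  # "195x" is an invented edge case, not from the question

unpadded <- sub(pat, "\\1", x)
unpadded
## [1] "195x" "195"

padded <- sub(pat, "\\1", paste(" ", x))  # paste() prepends spaces, so \\D always has something to match
padded
## [1] "195" "195"
```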

2) strapplyc. strapplyc in the gsubfn package extracts every match of the regular expression from each string, which allows a simpler pattern than above; tail then keeps the last match:

library(gsubfn)

sapply(strapplyc(ch, "\\d+"), tail, 1)
## [1] "195" "192" "145" "125" "125"

2a) strapply. Alternatively, use strapply (no c at the end) with as.numeric to return numbers instead of strings:

sapply(strapply(ch, "\\d+", as.numeric), tail, 1)
## [1] 195 192 145 125 125

Upvotes: 2

jdharrison

Reputation: 30425

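There are probably better ways, but you can strsplit on the separator characters and then take the last element of each result:

myData <- c("-042-195", "+143-192", "-001*145", "#045+125", "#125$")
sapply(strsplit(myData, "-|[*]|[+]|#|[$]"), tail, n = 1)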
[1] "195" "192" "145" "125" "125"

Or replace every non-alphanumeric character with a single marker character, split on that marker, and take the last element:

sapply(strsplit(gsub("[^[:alnum:] ]", "&", myData), "&"), tail, n = 1)
[1] "195" "192" "145" "125" "125"
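Both versions above depend on knowing which separator characters occur. A possible variation, assuming any run of non-digits should count as a separator, is to split on "\\D+" directly (myData is taken to be the vector from the question):

```r
myData <- c("-042-195", "+143-192", "-001*145", "#045+125", "#125$")

# Split on runs of non-digit characters and keep the last piece of each string.
last_piece <- sapply(strsplit(myData, "\\D+"), tail, n = 1)
last_piece
# [1] "195" "192" "145" "125" "125"
```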

Upvotes: 2
