Reputation: 1190
I am trying to extract the number at the beginning of a string in R. I have tried this:
> tt <- "51 - TS - Data estimated - see comments"
> grep('^[0-9]+', tt, value = TRUE)
[1] "51 - TS - Data estimated - see comments"
Why is it returning the whole string and not just the number?
Upvotes: 4
Views: 93
Reputation: 269596
1) sub. Try this, which removes the first non-digit and everything after it:
> sub("\\D.*", "", tt)
[1] "51"
2) strsplit. Or this, which splits on non-digits and takes the first component:
> strsplit(tt, "\\D")[[1]][1]
[1] "51"
3) strapplyc. Or this, which uses strapplyc from the gsubfn package to extract the leading digits:
> library(gsubfn)
> strapplyc(tt, "^\\d+", simplify = TRUE)
[1] "51"
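If you have a whole vector of such strings and want actual numbers rather than character strings, the sub approach can be sketched as below. This is only an illustration: the vector x is hypothetical (only its first element comes from the question), and it assumes that strings without a leading number should end up as NA.

```r
# Hypothetical input; only the first element is taken from the question.
x <- c("51 - TS - Data estimated - see comments", "7 apples", "no number here")

# Approach 1: drop the first non-digit and everything after it.
# sub() is vectorised, so this works element by element.
lead <- sub("\\D.*", "", x)

# Strings that do not start with a digit are reduced to "",
# which as.numeric() converts to NA.
lead_num <- as.numeric(lead)

print(lead)
print(lead_num)
```

Note that an element such as "no number here" gives an empty string, not the original text, so check for "" (or NA after conversion) if that case matters to you.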
Upvotes: 2
Reputation: 193517
grep returns either the position or the value (of the entire input) if a pattern is found. Try gsub or gregexpr + regmatches instead:
gsub("(^[0-9]+).*", "\\1", tt)
# [1] "51"
x <- gregexpr("^[0-9]+", tt)
regmatches(tt, x)
# [[1]]
# [1] "51"
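To make the difference concrete, here is a small sketch on a hypothetical vector x (only its first element is from the question). It uses regexpr rather than gregexpr, on the assumption that one match per string is enough here since the pattern is anchored at the start.

```r
# Hypothetical input; only the first element is taken from the question.
x <- c("51 - TS - Data estimated - see comments", "no number here", "7 apples")

# grep() only tells you WHICH elements match ...
idx <- grep("^[0-9]+", x)                 # positions of matching elements

# ... or returns those elements in full when value = TRUE.
vals <- grep("^[0-9]+", x, value = TRUE)  # whole strings, not just the digits

# regexpr() records where the first match is in each element (-1 if none),
# and regmatches() pulls out just the matched text.
m <- regexpr("^[0-9]+", x)
nums <- regmatches(x, m)

print(idx)
print(vals)
print(nums)
```

Be aware that regmatches() with regexpr() drops elements that have no match, so nums can be shorter than x.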
Upvotes: 4