Reputation: 41
Can you please help me understand the output of regexpr? I am interested in the text position, which is 10 in the example below. But the output shows two values, 10 and 4. How do I capture only the number 10?
Is this output a vector of numbers?
text<-"World is beautiful"
out<-regexpr("beau",text)
out
#[1] 10
#attr(,"match.length")
#[1] 4
#attr(,"useBytes")
#[1] TRUE
out[1]
#[1] 10
out[2]
#[1] NA
Upvotes: 4
Views: 2099
Reputation: 60000
out is a length-1 atomic vector with attributes:
str(out)
atomic [1:1] 10
- attr(*, "match.length")= int 4
- attr(*, "useBytes")= logi TRUE
The value of out (try c(out) to drop the attributes) is 10, which is the start position of the match within your string. attr(out, "match.length") is 4, which is the length of the match.
Your text vector has one element, hence out has one element. Try regexpr("beau", rep(text, 3)).
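As a rough sketch of the points above, the position and the match length can be pulled apart like this. The names pos, len and outs are only illustrative, and as.integer() is used here as one of several ways to drop the attributes:

```r
text <- "World is beautiful"
out <- regexpr("beau", text)

# Keep only the start position by dropping the attributes
pos <- as.integer(out)
pos
# [1] 10

# Read the match length from its attribute
len <- attr(out, "match.length")
len
# [1] 4

# With a longer input vector, the result has one position per element
outs <- regexpr("beau", rep(text, 3))
as.integer(outs)
# [1] 10 10 10

# The matched substring can be recovered from position and length
substr(text, pos, pos + len - 1)
# [1] "beau"
```

Note that out[1] from the question already returns 10 on its own, because subsetting drops the match.length and useBytes attributes.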
Upvotes: 2
Reputation: 17189
From the help page of regexpr, which you can open by typing ?regexpr in the R console:
regexpr returns an integer vector of the same length as text giving the starting position of the first match or -1 if there is none, with attribute "match.length", an integer vector giving the length of the matched text (or -1 for no match). The match positions and lengths are in characters unless useBytes = TRUE is used, when they are in bytes. If named capture is used there are further attributes "capture.start", "capture.length" and "capture.names".
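To illustrate the -1 behaviour described in the help text, here is a small sketch with one matching and one non-matching element. The vector x and its second string are made up for the example:

```r
x <- c("World is beautiful", "plain text")
m <- regexpr("beau", x)

# One start position per element; -1 marks "no match"
as.integer(m)
# [1] 10 -1

# The match lengths follow the same convention
attr(m, "match.length")
# [1]  4 -1

# Keep only the elements that actually matched
x[as.integer(m) != -1]
# [1] "World is beautiful"

# regmatches() returns the matched substrings, skipping non-matches
regmatches(x, m)
# [1] "beau"
```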
Upvotes: 0