Reputation: 3
I have a text field like this: -- :location: - '12.839006423950195' - '77.6580810546875' :last_location_update: 2015-08-10 16:41:46.817000000 Z
I want to extract 12.839006423950195 and 77.6580810546875 and put them into separate columns in the same data frame.
The length of these numbers varies, so the only way I can see to do it is to extract what sits between the first and second single quotation marks, and between the third and fourth.
I tried using str_locate_all and str_match_all, but I can't figure it out. Please help.
Thanks
Upvotes: 0
Views: 137
Reputation: 837
Without using any library, it can be done like this:
txt <- ":location: - '12.839006423950195' - '77.6580810546875' :last_location_update: 2015-08-10 16:41:46.817000000 Z"
start<-gregexpr("('.*?)[0-9.](.*?')+",txt)[[1]]+1
end<-start+attr(start,"match.length")-3
df<-data.frame(t(apply(cbind(start[1:2],end[1:2]),1,function(x) substr(txt,x[1],x[2]))))
> df
X1 X2
1 12.839006423950195 77.6580810546875
Thanks to @thelatemail:
txt <- ":location: - '12.839006423950195' - '77.6580810546875' :last_location_update: 2015-08-10 16:41:46.817000000 Z"
df<-data.frame(t(regmatches(txt, gregexpr("(?<=')[0-9.]+(?=')",txt,perl=TRUE))[[1]]))
df
X1 X2
1 12.839006423950195 77.6580810546875
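Since the question asks for the values as separate columns in the same data frame, here is a minimal sketch of applying the same lookaround pattern to a whole column in base R. The data frame `dat`, the column name `raw`, the second sample row, and the output names `lat` and `lon` are all assumptions for illustration; the question does not say which number is which.

```r
# Hypothetical data frame; 'raw' and the second row are made up for illustration
dat <- data.frame(
  raw = c(
    ":location: - '12.839006423950195' - '77.6580810546875' :last_location_update: 2015-08-10 16:41:46.817000000 Z",
    ":location: - '13.5' - '80.25' :last_location_update: 2015-08-11 09:00:00.000000000 Z"
  ),
  stringsAsFactors = FALSE
)

# One character vector of quoted numbers per row
m <- regmatches(dat$raw, gregexpr("(?<=')[0-9.]+(?=')", dat$raw, perl = TRUE))

# First and second match of each row, converted to numeric
# (a row with fewer matches yields NA; 'lat'/'lon' are assumed names)
dat$lat <- as.numeric(sapply(m, `[`, 1))
dat$lon <- as.numeric(sapply(m, `[`, 2))
```

Indexing each match vector with `[` rather than binding the matches into a matrix means rows with a different number of quoted values should not break the assignment.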
Upvotes: 0
Reputation: 887038
We can use str_extract_all from library(stringr). We use regex lookarounds to match one or more digits or decimal points ([0-9.]+) that sit within the single quotes ((?<=') and (?=')).
library(stringr)
lst <- lapply(str_extract_all(txt, "(?<=')[0-9.]+(?=')") , as.numeric)
If all the list elements have the same length,
df1 <- setNames(do.call(rbind.data.frame, lst), paste0('V', 1:2))
gives a two-column 'data.frame'.
txt <- ":location: - '12.839006423950195' - '77.6580810546875' :last_location_update: 2015-08-10 16:41:46.817000000 Z"
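As a possible variation on the above, str_extract_all also takes a simplify = TRUE argument that returns a character matrix instead of a list, which can make the column step shorter. This is only a sketch: the vector `txts`, its second element, and the name `df2` are hypothetical, and it assumes every string contains exactly two quoted numbers.

```r
library(stringr)

# Hypothetical input vector; the second string is made up for illustration
txts <- c(
  ":location: - '12.839006423950195' - '77.6580810546875' :last_location_update: 2015-08-10 16:41:46.817000000 Z",
  ":location: - '13.5' - '80.25' :last_location_update: 2015-08-11 09:00:00.000000000 Z"
)

# simplify = TRUE gives a character matrix: one row per string, one column per match
mat <- str_extract_all(txts, "(?<=')[0-9.]+(?=')", simplify = TRUE)

# Convert each matrix column to numeric and name the columns V1, V2
df2 <- data.frame(V1 = as.numeric(mat[, 1]), V2 = as.numeric(mat[, 2]))
```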
Upvotes: 1