Reputation: 1399
I have a list of txt files stored in A.path that I would like to use grep on to find the year associated with each file, and save these years to a vector. However, as some of these txt files have multiple years in their text, I would only like to store the first year from each file. How can I do this?
I've done similar things using lapply, and this is how I began approaching this problem:
lapply(A.path, function(i){
j <- paste0(scan(i, what = character(), comment.char='', quote=NULL), collapse = " ")
year <- vector()
year[i] <- grep('[0-9][0-9][0-9][0-9]', j)
})
grep probably isn't the right function to use, as this returns the entirety of j for each i. What is the right function to use here?
Upvotes: 5
Views: 1198
Reputation: 32456
Converting comment to answer: you can use gsub with \\1 to extract the value of the first match (i.e. the text matched between the parentheses () in the regex):
gsub(".*?([0-9]{4}).*", "\\1", j)
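To collect the first year from every file into a vector, something along these lines might work. This is only a sketch: first_year and read_first_year are hypothetical helper names, and it swaps gsub for regmatches with regexpr, because gsub returns the input string unchanged when there is no four-digit match, whereas this version returns NA. It also reads files with readLines rather than scan, which is an assumption about the files being plain line-oriented text.

```r
# Hypothetical helper: first run of four digits in a string, or NA if none.
first_year <- function(text) {
  # regexpr() locates only the first match; regmatches() extracts it.
  m <- regmatches(text, regexpr("[0-9]{4}", text))
  if (length(m) == 0) NA_character_ else m
}

# Hypothetical helper: read one file, collapse it to a single string,
# and return the first year found in it.
read_first_year <- function(path) {
  txt <- paste(readLines(path, warn = FALSE), collapse = " ")
  first_year(txt)
}

first_year("Published 1999, revised 2004")  # "1999"
first_year("no digits here")                # NA
```

With A.path being the character vector of file paths from the question, vapply(A.path, read_first_year, character(1)) should then give a named character vector with one year (or NA) per file.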
Upvotes: 5