T'n'E

Reputation: 616

Match letters in R regex

Suppose I run the following:

txt <- "client:A, field:foo, category:bar"
grep("field:[A-z]+", txt, value = TRUE, perl = TRUE)

Based on regexr.com I expected I would get field:foo, but instead I get the entire string. Why is this?

Upvotes: 3

Views: 8128

Answers (1)

Wiktor Stribiżew

Reputation: 627469

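grep with value = TRUE returns every element of the input vector that contains a match, not the matched substring, which is why you get the whole string back. You seem to want to extract the value. Use regmatches: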

txt <- "client:A, field:foo, category:bar"
regmatches(txt, regexpr("field:[[:alpha:]]+", txt))
# => [1] "field:foo"

See the R demo.

To match multiple occurrences, replace regexpr with gregexpr.
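As a minimal sketch of the gregexpr variant: the input string txt2 below is made up for illustration (it is not from the question) and simply contains two field: entries.

```r
# Hypothetical input with two "field:" entries (assumption, for illustration only)
txt2 <- "client:A, field:foo, category:bar, field:baz"

# gregexpr() finds all matches; regmatches() then returns a list
# with one character vector per input string
matches <- regmatches(txt2, gregexpr("field:[[:alpha:]]+", txt2))
matches[[1]]
# => [1] "field:foo" "field:baz"
```

Because the result is a list, index it with [[1]] (or unlist() it) when there is only one input string.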

Or use stringr str_extract_all:

library(stringr)
str_extract_all(txt, "field:[a-zA-Z]+")

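Another point is that [A-z] matches more than ASCII letters: in ASCII order the range also covers the six characters between Z and a, namely [, \, ], ^, _ and `. To match any letter, use [[:alpha:]] in a TRE regex (regexpr / gregexpr with no perl=TRUE) or an ICU regex (stringr).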
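A quick way to see the [A-z] problem for yourself is the check below. It is only a sketch: the vector name extras is made up for illustration, and perl = TRUE is used for the first call so that the range is interpreted by code point, as in the question.

```r
# The six ASCII characters that sit between "Z" and "a"
extras <- c("[", "\\", "]", "^", "_", "`")

# [A-z] accepts every one of them
in_range <- grepl("^[A-z]$", extras, perl = TRUE)
in_range
# => [1] TRUE TRUE TRUE TRUE TRUE TRUE

# [[:alpha:]] rejects them, since none is a letter
is_letter <- grepl("^[[:alpha:]]$", extras)
is_letter
# => [1] FALSE FALSE FALSE FALSE FALSE FALSE
```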

Upvotes: 6
