Reputation: 311
Problem: I am using R and stringr and I have a very long regular expression using the "or" operator that I save to an object and use with stringr. How can I break it up into multiple lines in R so I do not have to keep scrolling to the right in my source editor? When I try commas, only the first line is recognized. Most answers to this question have been for other programming languages (i.e. not R).
regex_of_sites <- "side|southeast|north|computer|engineer|first|south|pharm|left|southwest|level|second|thirteenth"
Upvotes: 3
Views: 662
Reputation: 627190
Since you are using the pattern with stringr functions, which use the ICU regex flavor, you can use the (?x) free-spacing modifier (also called verbose, or ignore pattern whitespace). With it, all unescaped whitespace is ignored when the pattern is compiled, and you can add a comment after an unescaped # on each line (so every literal # and every literal space must be escaped).
Here is an example:
> library(stringr)
> regex_of_sites <- "(?x)side # Term 0
+ |southeast # Term 1
+ |north # Term 2
+ |computer # Term 3
+ |engineer
+ |first
+ |south
+ |pharm
+ |left
+ |southwest
+ |level
+ |second
+ |thirteenth"
> str_extract_all("first level", regex_of_sites)
[[1]]
[1] "first" "level"
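One consequence of free-spacing mode is that a literal space or # in the pattern has to be escaped, otherwise it is silently dropped or starts a comment. A minimal sketch of that, assuming stringr is installed; the pattern and the input string here are made up for illustration and are not from the question:

```r
library(stringr)

# In an R string, "\\ " is the regex `\ ` (a literal space)
# and "\\#" is the regex `\#` (a literal '#').
pattern_with_literals <- "(?x)
  first \\ level   # escaped space: matches 'first level' as one term
  | \\# \\d+       # escaped hash: matches a '#' followed by digits
"

matches <- str_extract_all("first level #42", pattern_with_literals)
print(matches)
```

Unescaped spaces around the terms are only there for layout and do not take part in the match.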
The same modifier is supported by the PCRE patterns used in base R regex functions with perl=TRUE.
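For the base R case, a rough sketch of what that could look like with gregexpr() and regmatches(), using a shortened version of the pattern (the variable names are just for illustration):

```r
# (?x) turns on free-spacing mode in PCRE, so the pattern can span lines.
regex_of_sites_pcre <- "(?x)
  side         # comments work here too
  | southeast
  | north
  | first
  | level
"

x <- "first level"

# perl = TRUE is required; the default TRE engine does not support (?x).
base_matches <- regmatches(x, gregexpr(regex_of_sites_pcre, x, perl = TRUE))
print(base_matches)
```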
Upvotes: 6
Reputation: 206466
The regular expression is just a string. You can paste it together across multiple lines like any other string:
regex_of_sites <- paste0("side|southeast|north|computer|engineer|",
"first|south|pharm|left|southwest|",
"level|second|thirteenth")
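If the alternatives are easier to maintain as a list, a variation on the same idea is to keep them in a character vector and join them with "|". This is only a sketch; `sites` is a name introduced here for illustration:

```r
# One term per element, so the vector can wrap across as many lines as needed.
sites <- c("side", "southeast", "north", "computer", "engineer",
           "first", "south", "pharm", "left", "southwest",
           "level", "second", "thirteenth")

# collapse = "|" joins the terms into a single alternation pattern.
regex_of_sites <- paste(sites, collapse = "|")
print(regex_of_sites)
```

This produces the same single-line string as in the question, and adding or removing a term does not require touching the "|" separators by hand.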
Upvotes: 4