Reputation: 167
I want to split this string into parts at the plus and minus characters in R:
test = "-1x^2+3x^3-x^8+1-x"
My goal is to get:
"-1x^2" "+3x^3" "-x^8" "+1" "-x"
This didn't work:
strsplit(test, split = "-")
strsplit(test, split = "+")
Upvotes: 5
Views: 480
Reputation: 163277
In your examples, you use strsplit with a single plus or minus sign as the pattern, which splits on every occurrence.
Instead, you could assert that what is directly to the left is neither the start of the string nor + or -, while asserting that + or - is directly to the right.
(?<!^|[+-])(?=[+-])
Explanation
- (?<! Negative lookbehind assertion
  - ^ Start of string
  - | Or
  - [+-] Match either + or - using a character class
- ) Close lookbehind
- (?= Positive lookahead assertion
  - [+-] Match either + or -
- ) Close lookahead

As the pattern uses lookaround assertions, you have to set perl = T to use a Perl-style regex.
Example
test <- "-1x^2+3x^3-x^8+1-x"
strsplit(test, split = "(?<!^|[+-])(?=[+-])", perl = T)
Output
[[1]]
[1] "-1x^2" "+3x^3" "-x^8" "+1" "-x"
See an online R demo.
If a sign that is directly preceded by whitespace should also not be a split point, you can write the pattern (as an R string) as
(?<!^|[\\s+-])(?=[+-])
See a regex demo.
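To make the difference between the two patterns concrete, here is a minimal sketch that runs both on an input with spaces around the operators. The string spaced is a made-up example, not taken from the question, and the variable names are only illustrative.

```r
# Hypothetical input with spaces around the operators (not from the question).
spaced <- "-1x^2 + 3x^3 - x^8"

# Basic pattern: split before a sign unless the sign is at the start
# of the string or directly follows another sign.
basic <- strsplit(spaced, "(?<!^|[+-])(?=[+-])", perl = TRUE)[[1]]

# Variant: a sign that directly follows whitespace is not a split point either.
strict <- strsplit(spaced, "(?<!^|[\\s+-])(?=[+-])", perl = TRUE)[[1]]

print(basic)
print(strict)
```

Because every sign after the first one is preceded by a space here, the variant should leave the string in one piece, while the basic pattern still splits it. Both splits are zero-width, so no characters are dropped.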
Upvotes: 4
Reputation: 269526
This uses gsub to search for any character followed by + or - and inserts a semicolon between the two characters. Then it splits on semicolon.
s <- "-1x^2+3x^3-x^8+1-x"
strsplit(gsub("(.)([+-])", "\\1;\\2", s), ";")[[1]]
## [1] "-1x^2" "+3x^3" "-x^8" "+1" "-x"
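One thing to watch with this approach is that the temporary delimiter must not already occur in the input. A small sketch of a wrapper that checks for this follows; split_by_marker and its marker argument are hypothetical names introduced here for illustration, not part of the answer above.

```r
# Hypothetical helper: split before each "+" or "-" by inserting a
# temporary marker with gsub and then splitting on that marker.
split_by_marker <- function(s, marker = ";") {
  # Assumption: the marker does not already occur in the input;
  # otherwise the final split would cut at the wrong places.
  stopifnot(!grepl(marker, s, fixed = TRUE))
  # Insert the marker between any character and a following sign.
  marked <- gsub("(.)([+-])", paste0("\\1", marker, "\\2"), s)
  # Split on the literal marker (fixed = TRUE avoids regex interpretation).
  strsplit(marked, marker, fixed = TRUE)[[1]]
}

split_by_marker("-1x^2+3x^3-x^8+1-x")
```

If the input could contain a semicolon, a different marker can be passed, as long as it has no special meaning in a gsub replacement string.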
Upvotes: 5
Reputation: 101257
Try
> strsplit(test, split = "(?<=.)(?=[+-])", perl = TRUE)[[1]]
[1] "-1x^2" "+3x^3" "-x^8" "+1" "-x"
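where (?<=.)(?=[+-]) matches the zero-width position that has any character before it and + or - directly after it, so the string is split in front of every sign except a leading one.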
Upvotes: 5
Reputation: 16856
We can provide a regular expression in strsplit that uses the lookahead (?=...) to find a plus or minus sign and the lookbehind (?<=.) to require a preceding character, and then split at that position. Because the match is zero-width, the sign itself is retained rather than being dropped in the split.
strsplit(test, "(?<=.)(?=[+])|(?<=.)(?=[-])", perl = TRUE)[[1]]
# [1] "-1x^2" "+3x^3" "-x^8" "+1" "-x"
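The two alternatives in this pattern differ only in the sign they look ahead for, so they can probably be folded into one character class. A quick sketch to check that both forms give the same pieces for the string from the question (the variable names two_branch and one_class are just illustrative):

```r
test <- "-1x^2+3x^3-x^8+1-x"

# Pattern with one alternative per sign, as in the answer above.
two_branch <- strsplit(test, "(?<=.)(?=[+])|(?<=.)(?=[-])", perl = TRUE)[[1]]

# Same idea with both signs in a single character class.
one_class <- strsplit(test, "(?<=.)(?=[+-])", perl = TRUE)[[1]]

# Compare the two results element by element.
identical(two_branch, one_class)
```

Going by the outputs already shown in this thread, the comparison should return TRUE for this input.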
Upvotes: 7