Reputation: 348
I found a very strange behavior in strsplit(). It's similar to this question; however, I would love to know why it is returning an empty element in the first place. Does someone know?
unlist(strsplit("88F5T7F4T13F", "\\d+"))
[1] "" "F" "T" "F" "T" "F"
Since I use that string for reproducing a long logical vector (88*FALSE 5*TRUE 7*FALSE 4*TRUE 13*FALSE), I have to trust it...
Dropping the first element with unlist(strsplit("88F5T7F4T13F", "\\d+"))[-1] works, but is it robust?
Upvotes: 3
Views: 541
Reputation: 626861
The empty element appears because the string starts with digits. Since you split at digits, the first split occurs right between the start of the string and the first F, and the empty string before that first match is added to the resulting list.
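A small sketch of that behavior, using two made-up inputs (the variable names lead and trail are only for illustration): a match at the start of the string produces a leading "", while a match at the end does not produce a trailing one, as described in ?strsplit.

```r
# Delimiter match at the start of the string: an empty first element appears
lead <- strsplit("88F5T", "\\d+")[[1]]
lead   # "" "F" "T"

# Delimiter match at the end of the string: no empty last element
trail <- strsplit("F5T13", "\\d+")[[1]]
trail  # "F" "T"
```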
You may use your own solution since it is already working well. If you are interested in alternative solutions, see below:
unlist(strsplit(sub("^\\d+", "", "88F5T7F4T13F"), "\\d+"))
It makes the empty element in the resulting split disappear since sub with the ^\d+ pattern removes all leading digits (^ is the start of string and \d+ matches 1 or more digits). However, it is less concise, since it uses 2 regexps.
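One difference between the two approaches may be worth checking if the input format can vary. The sketch below uses a hypothetical string without leading digits (no_lead is an assumed example, not from the question): [-1] always drops the first element, whereas sub() leaves the string untouched when there is nothing to strip.

```r
# Hypothetical input that does NOT start with digits
no_lead <- "F5T7F"

# [-1] removes the first element unconditionally, so a real value is lost
dropped <- unlist(strsplit(no_lead, "\\d+"))[-1]
dropped   # "T" "F"

# sub() finds no leading digits, changes nothing, and all values survive
stripped <- unlist(strsplit(sub("^\\d+", "", no_lead), "\\d+"))
stripped  # "F" "T" "F"
```

For strings that always start with a count, as in the question, both give the same result.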
library(stringr)
s <- "88F5T7F4T13F"
# str_extract_all() returns a list with one character vector per input string
res = str_extract_all(s, "\\D+")
This only requires one matching regex, \D+ (1 or more non-digit symbols), and one external library.
If you want to do a similar thing with base R, use regmatches with gregexpr:
regmatches(s, gregexpr("\\D+", s))
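Putting this together for the use case in the question, here is a minimal base R sketch that expands the string into the logical vector. It assumes the string strictly alternates a count with a single T or F; the names counts, flags and x are illustrative.

```r
s <- "88F5T7F4T13F"

# Run lengths: every run of digits, converted to integers
counts <- as.integer(regmatches(s, gregexpr("\\d+", s))[[1]])

# Run values: every run of non-digits, TRUE where the letter is "T"
flags <- regmatches(s, gregexpr("\\D+", s))[[1]] == "T"

# Guard against a malformed string before expanding
stopifnot(length(counts) == length(flags))

# Repeat each flag by its count
x <- rep(flags, times = counts)
length(x)  # 117
sum(x)     # 9
```

Extracting matches instead of splitting avoids the empty first element altogether, so no [-1] is needed.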
Upvotes: 1