Lavoslav Caklovic

Reputation: 29

Strange behavior of strsplit() in R?

I would like to split the string x = "a,b," (with a trailing comma) into the vector c("a","b","") using strsplit().

The result is:

> strsplit(x, ',')
[[1]]
[1] "a" "b"

I would like to have the third component (empty string or NULL).

read.csv() can handle this, but I still think strsplit() should behave as I expect. Python's split gives the equivalent of c("a","b","").

Is there an option of strsplit() that I am missing?

Upvotes: 2

Views: 637

Answers (1)

Spacedman

Reputation: 94202

That's how it works and is documented in help(strsplit):

 Note that this means that if there is a match at the beginning of
 a (non-empty) string, the first element of the output is ‘""’, but
 if there is a match at the end of the string, the output is the
 same as with the match removed.

You might want to use str_split from the stringr package:

> require(stringr)
> str_split("a,b,",",")
[[1]]
[1] "a" "b" "" 

> str_split("a,b",",")
[[1]]
[1] "a" "b"

> str_split(",a,b",",")
[[1]]
[1] ""  "a" "b"

> str_split(",a,b,,,",",")
[[1]]
[1] ""  "a" "b" ""  ""  "" 
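
If adding stringr is not an option, one possible base R workaround follows from the documented behaviour quoted above: since strsplit() drops a single match at the end of the string, appending one extra separator before splitting leaves the original trailing empty field in place. The helper name split_keep_trailing is made up for this sketch and is not part of base R. It assumes a literal (non-regex) separator and non-NA input.

```r
# Hypothetical helper (not a base R function): keep trailing empty fields
# by appending one sacrificial separator, which strsplit() then discards.
split_keep_trailing <- function(x, sep = ",") {
  # fixed = TRUE treats sep as a literal string rather than a regex
  strsplit(paste0(x, sep), sep, fixed = TRUE)
}

split_keep_trailing("a,b,")[[1]]     # "a" "b" ""
split_keep_trailing("a,b")[[1]]      # "a" "b"
split_keep_trailing(",a,b,,,")[[1]]  # ""  "a" "b" ""  ""  ""
```

For these inputs the results match the str_split output shown above. Note that NA inputs would be turned into the string "NA" by paste0(), so they need separate handling.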

Upvotes: 8
