Canovice

Reputation: 10163

strsplit not consistently working, character between letters isn't a space?

The problem looks simple, but I'm having no luck fixing it. strsplit() is a fairly simple function, so I'm surprised to be struggling this much:

# temp is the problem string, copied and pasted from my R code.
# I am hoping the third character (the apparent space, which I think is the culprit) survives the copy/paste
temp = "GS PG"

# temp2 is created in stackoverflow, using an actual space
temp2 = "GS PG"

unlist(strsplit(temp, split = " "))
[1] "GS PG"
unlist(strsplit(temp2, split = " "))
[1] "GS" "PG"

Even if this example doesn't reproduce the problem here, this is the issue I am running into: with temp, strsplit() isn't splitting the string on the space for some odd reason. Any thoughts would be appreciated!

Best,

EDIT - my example failed to recreate the issue. For reference, temp is created in my code by scraping a web page with rvest, and I suspect the scrape is returning some character other than a normal space. I still need to split these strings on the space, though.

Upvotes: 4

Views: 1951

Answers (2)

ode2k

Reputation: 2723

Try the following:

unlist(strsplit(temp, "\\s+"))

The "\\s+" pattern is a regular expression that matches one or more whitespace characters of any kind (tabs, newlines, and so on), rather than only a standard space.
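If the scraped character turns out to be a no-break space (U+00A0, which is what &nbsp; becomes in scraped HTML), replacing it explicitly before splitting is one option. This is only a sketch: the question never confirms which character it is, so the temp below is a made-up stand-in built with "\u00a0", and cleaned and parts are illustrative names.

```r
# Hypothetical stand-in for the scraped value; ASSUMES the odd character is U+00A0
temp <- "GS\u00a0PG"

# Swap the no-break space for an ordinary space (literal match, no regex)
cleaned <- gsub("\u00a0", " ", temp, fixed = TRUE)

# Then split on runs of whitespace
parts <- unlist(strsplit(cleaned, "\\s+"))
print(parts)
# [1] "GS" "PG"
```

Doing the replacement with fixed = TRUE avoids relying on how a given regex engine classifies U+00A0.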

Upvotes: 8

USER_1

Reputation: 2469

As noted in the comments, it is likely that the "space" is not actually a space but some other whitespace character. Try the following to narrow it down:

whitespace <- c(" ", "\t", "\n", "\r", "\v", "\f")
# test each candidate separately; TRUE marks a character that occurs in temp
sapply(whitespace, grepl, x = temp, fixed = TRUE)
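If none of those match, looking at the raw code points may show what the character really is. A small sketch, using a made-up temp that assumes a no-break space (U+00A0); the real scraped string may contain something else, and codes and hex are illustrative names.

```r
# Hypothetical stand-in for the scraped value; ASSUMES the odd character is U+00A0
temp <- "GS\u00a0PG"

# Integer code point of each character; an ordinary space would be 32
codes <- utf8ToInt(temp)
print(codes)
# [1]  71  83 160  80  71

# The same values in the usual U+XXXX notation
hex <- sprintf("U+%04X", codes)
print(hex)
# [1] "U+0047" "U+0053" "U+00A0" "U+0050" "U+0047"
```

Any value in the third position other than 32 means the character is not a plain space.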

Related question here: How to remove all whitespace from a string?

Upvotes: 0
