Terence Chow

Reputation: 11153

String has trailing whitespace that isn't white space? (i.e. strip doesn't get rid of it)

I have the following string, which I got from parsing some HTML:

"this is my string  "

If I use .strip or .rstrip the string remains the same.

However, if I literally type the string "this is my string " and call .strip on it, the trailing spaces get stripped.

This leads me to believe that the string I obtained from parsing the HTML does not end in ordinary white space. So my questions are: 1) what is trailing the string if it isn't white space, and 2) how do I get rid of it?

Upvotes: 0

Views: 214

Answers (1)

Casimir et Hippolyte

Reputation: 89557

The Unicode table contains several whitespace characters, and it is possible that not all of them are handled by the strip methods. If you want to use a regular expression with the sub method, you can try one of these simple patterns: /\p{Space}+\z/ or /[[:space:]]+\z/ to trim all the blank characters on the right. (The replacement string must, of course, be empty.)
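As a rough sketch of both steps (finding out what the trailing characters are, then trimming them): the input below is made up and assumes the culprit is a non-breaking space (U+00A0), which is a common result of parsing &nbsp; in HTML. Your actual string may contain a different character, so check the code points first.

```ruby
# Hypothetical input: assumes the two trailing characters are
# non-breaking spaces (U+00A0), not ordinary spaces (U+0020).
raw = "this is my string\u00A0\u00A0"

# 1) Inspect the code points of the last characters to see what they are.
trailing = raw.chars.last(2).map { |c| format("U+%04X", c.ord) }
p trailing

# rstrip only removes null and ASCII whitespace, so the string is unchanged.
stripped = raw.rstrip
p stripped == raw

# 2) Trim any Unicode whitespace on the right with sub and an empty replacement.
cleaned = raw.sub(/[[:space:]]+\z/, "")
p cleaned
```

Swapping in /\p{Space}+\z/ for the pattern should give the same result here.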

Note: in Ruby, \s is equivalent to [ \t\r\n\f] and does not cover all the whitespace characters in the Unicode table.
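To illustrate the difference, here is a small check against a non-breaking space (again an assumed example character), plus a helper named rstrip_unicode that is purely hypothetical and not part of Ruby's String API:

```ruby
# Assumed example character: a non-breaking space (U+00A0).
nbsp = "\u00A0"

# \s is ASCII-only in Ruby, so it does not match the non-breaking space.
matches_s = !(nbsp =~ /\s/).nil?

# The POSIX bracket and the Unicode property are Unicode-aware and do match.
matches_posix = !(nbsp =~ /[[:space:]]/).nil?
matches_prop  = !(nbsp =~ /\p{Space}/).nil?

# Hypothetical helper: right-strip any Unicode whitespace.
def rstrip_unicode(str)
  str.sub(/[[:space:]]+\z/, "")
end

p matches_s
p matches_posix
p matches_prop
p rstrip_unicode("this is my string \u00A0")
```

A pattern such as /\s+\z/ would therefore leave the string from the question untouched, which is why the patterns above use [[:space:]] or \p{Space} instead.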

Upvotes: 2
