Reputation: 44958
I have a string that looks like this:
d = "foo\u00A0\bar"
When I check the length, it reports 7 characters. I looked it up and found that \u00A0 is a non-breaking space. Could someone show me how to remove all the non-breaking spaces in a string?
Upvotes: 34
Views: 15577
Reputation: 3123
If you do not care about the non-breaking space specifically, but about any "special" Unicode whitespace character that might appear in your string, you can remove it using the POSIX bracket expression for whitespace:
s.gsub(/[[:space:]]/, '')
Unlike shorthand matchers such as \s, which match only ASCII characters, these bracket expressions match all Unicode characters of a class.
For more details, see the Ruby documentation.
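To make the difference concrete, here is a minimal sketch comparing the two matchers. The sample string and variable names are made up for illustration, and it assumes a UTF-8 string on Ruby 1.9 or later:

```ruby
# Hypothetical sample: a non-breaking space after "foo", a regular space after "bar".
text = "foo\u00A0bar baz"

# \s only matches ASCII whitespace, so the non-breaking space is left in place.
ascii_only = text.gsub(/\s/, '')

# [[:space:]] is Unicode-aware and matches U+00A0 as well.
unicode_aware = text.gsub(/[[:space:]]/, '')

puts ascii_only.length     # 10 (the non-breaking space is still there)
puts unicode_aware.length  # 9
```

Note that this removes every whitespace character, not just the non-breaking ones, so it only fits when that is what you want.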
Upvotes: 46
Reputation: 1547
It's an old thread, but maybe this helps somebody.
I found myself looking for a solution to the same problem when I discovered that strip doesn't do the job. I used ord to check what the character was, and chr to represent it in gsub:
2.2.3 :010 > 160.chr("UTF-8")
=> " "
2.2.3 :011 > 160.chr("UTF-8").strip
=> " "
2.2.3 :012 > nbsp = 160.chr("UTF-8")
=> " "
2.2.3 :013 > nbsp.gsub(160.chr("UTF-8"),"")
=> ""
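I couldn't understand why strip doesn't remove something that looked like a space to me, so I checked here what character 160 actually is. It is U+00A0, the non-breaking space; strictly speaking it is not ASCII, which only covers codes 0 to 127.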
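If the goal is a strip-like trim that also handles non-breaking spaces, something along these lines might work. This is only a sketch: the padded string is a made-up example, and the anchored regex is one possible approach rather than a built-in alternative to strip:

```ruby
nbsp = 160.chr("UTF-8")       # same character as "\u00A0"
padded = "#{nbsp}foo#{nbsp}"  # hypothetical input with NBSP on both ends

# strip only knows about ASCII whitespace (and null), so nothing is removed.
still_padded = padded.strip

# Trim any Unicode whitespace from the start (\A) and the end (\z) only.
trimmed = padded.gsub(/\A[[:space:]]+|[[:space:]]+\z/, '')

puts nbsp.ord             # 160
puts still_padded.length  # 5
puts trimmed.length       # 3
```

Unlike a plain gsub on the character, this leaves non-breaking spaces in the middle of the string untouched.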
Upvotes: 9
Reputation: 2835
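d.gsub("\u00A0", "") does not work in Ruby 1.8, which has no \u escape. Instead, use d.gsub(/\302\240/, ""), which matches the two UTF-8 bytes of the non-breaking space written as octal escapes.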
See http://blog.grayproductions.net/articles/understanding_m17n for lots more on the character encoding differences between 1.8 and 1.9.
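For anyone wondering where \302\240 comes from: those are the two UTF-8 bytes of U+00A0 written in octal. The following sketch shows the relationship; it is meant to be run on a modern Ruby (1.9 or later), not on 1.8, and the variable names are only illustrative:

```ruby
nbsp = "\u00A0"

# The UTF-8 encoding of U+00A0 is two bytes: 0xC2 0xA0.
bytes = nbsp.bytes                     # [194, 160]

# Written in octal, those bytes are 302 and 240.
octal = bytes.map { |b| b.to_s(8) }    # ["302", "240"]

# Rebuilding the string from the octal byte values gives the same character.
rebuilt = [0302, 0240].pack("C*").force_encoding("UTF-8")

puts octal.join(" ")    # 302 240
puts rebuilt == nbsp    # true
```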
Upvotes: 2
Reputation:
irb(main):001:0> d = "foo\u00A0\bar"
=> "foo \bar"
irb(main):002:0> d.gsub("\u00A0", "")
=> "foo\bar"
Upvotes: 38