Mridang Agarwalla

Reputation: 44958

How do I remove a non-breaking space in Ruby

I have a string that looks like this:

d = "foo\u00A0\bar"

When I check the length, Ruby reports 7 characters. I looked it up and found that \u00A0 is a non-breaking space. Could someone show me how to remove all the non-breaking spaces from a string?

Upvotes: 34

Views: 15577

Answers (4)

NobodysNightmare

Reputation: 3123

If you do not care about the non-breaking space specifically, but about any "special" Unicode whitespace character that might appear in your string, you can remove them all using the POSIX bracket expression for whitespace:

s.gsub(/[[:space:]]/, '')

Unlike matchers such as \s, which only match ASCII whitespace, these bracket expressions match every Unicode character in the class.
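To make the difference concrete, here is a small sketch. The sample string and the variable names are made up for illustration; it assumes Ruby 1.9 or later with UTF-8 strings.

```ruby
# Hypothetical sample: U+00A0 (non-breaking space) and U+2003 (em space)
# are Unicode whitespace but not ASCII whitespace.
s = "foo\u00A0bar\u2003baz qux"

# \s only covers ASCII whitespace, so only the plain space is removed.
ascii_only = s.gsub(/\s/, '')

# [[:space:]] also covers the non-ASCII whitespace characters.
unicode_aware = s.gsub(/[[:space:]]/, '')

puts ascii_only.length    # 14
puts unicode_aware.length # 12
puts unicode_aware        # foobarbazqux
```

If you want to keep ordinary spaces and drop only the non-breaking ones, target "\u00A0" directly instead of the whole class.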

For more details, see the Ruby documentation.

Upvotes: 46

Bart C

Reputation: 1547

It's an old thread, but maybe this helps somebody. I was looking for a solution to the same problem after discovering that strip doesn't do the job. I used ord to find out what the character was, then used chr to build it for gsub:

2.2.3 :010 > 160.chr("UTF-8")
 => " " 
2.2.3 :011 > 160.chr("UTF-8").strip
 => " " 
2.2.3 :012 > nbsp = 160.chr("UTF-8")
 => " " 
2.2.3 :013 > nbsp.gsub(160.chr("UTF-8"),"")
 => ""

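I couldn't understand why strip doesn't remove something that looked like a space to me, so I looked up what character 160 actually is. It is the non-breaking space (U+00A0), which lies outside the ASCII range, and strip does not treat it as whitespace.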
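If what you actually wanted from strip was to trim the ends, a rough sketch along these lines should work. The padded sample string and the anchored regex are my own illustration, not something from the session above; it assumes Ruby 1.9 or later.

```ruby
nbsp = "\u00A0"
puts nbsp.ord # 160, the same code point found with ord above

# Hypothetical input with non-breaking spaces on both ends.
padded = "#{nbsp} foo #{nbsp}"

# strip stops at the non-breaking spaces, so nothing is removed here.
stripped = padded.strip
puts stripped.length # 7

# Trim any Unicode whitespace from the start and end only.
trimmed = padded.gsub(/\A[[:space:]]+|[[:space:]]+\z/, '')
puts trimmed # foo
```

The \A and \z anchors limit the match to the ends of the string, so whitespace in the middle is left alone.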

Upvotes: 9

Jonathan Stray

Reputation: 2835

d.gsub("\u00A0", "") does not work in Ruby 1.8. Instead, use d.gsub(/\302\240/,""), which matches the two bytes of the character's UTF-8 encoding.

See http://blog.grayproductions.net/articles/understanding_m17n for lots more on the character encoding differences between 1.8 and 1.9.
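For anyone wondering where \302\240 comes from: those are the two UTF-8 bytes of U+00A0 written in octal. A quick way to check this, sketched for Ruby 1.9 or later (the variable names are just for illustration):

```ruby
nbsp = "\u00A0"

# The UTF-8 encoding of U+00A0 is two bytes long.
bytes = nbsp.bytes
p bytes # [194, 160]

# Written in octal, those bytes are the escapes used in the 1.8 regex.
octal = bytes.map { |b| b.to_s(8) }
p octal # ["302", "240"]

# Rebuilding the character from the octal byte values gives the same string.
rebuilt = [0o302, 0o240].pack("C*").force_encoding("UTF-8")
puts rebuilt == nbsp # true
```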

Upvotes: 2

user24359

Reputation:

irb(main):001:0> d = "foo\u00A0\bar"
=> "foo \bar"
irb(main):002:0> d.gsub("\u00A0", "")
=> "foo\bar"

Upvotes: 38
