chris P

Reputation: 6589

Nokogiri converting &nbsp; to ?. How can I get it to convert to a space?

I open my doc like this: doc = Nokogiri::HTML(open(team_url)), and later on I parse through an HTML table's <td> elements.

In the HTML, there is often an element that looks like this

<td>&nbsp;</td>

When I run

content = row.xpath("td[1]/text()")

I end up getting ? as a result for content, instead of a space.

Why is this, and how can I resolve it?

Upvotes: 1

Views: 615

Answers (1)

Rob Donnelly

Reputation: 21

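Nokogiri converts "&nbsp;" to the Unicode no-break space character (U+00A0), not to an ordinary space. The ? is most likely how that character gets displayed when your terminal or output encoding cannot represent it. You can do a global substitution to resolve this.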

content.text.gsub("\u00A0", ' ') # replace &nbsp; with space

content.text.gsub("\u00A0", '') # remove &nbsp;
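If you need this in several places, a small helper keeps it tidy. The following is only a sketch in plain Ruby: the names NBSP, normalize_nbsp, and blank_cell? are made up for illustration, and the last line rests on the assumption that the ? comes from writing the character out in an encoding that lacks U+00A0.

```ruby
# Hypothetical helpers; only String#gsub, #strip and #encode are standard Ruby.
NBSP = "\u00A0" # the character Nokogiri produces for &nbsp;

# Replace every no-break space with an ordinary space.
def normalize_nbsp(text)
  text.gsub(NBSP, " ")
end

# True when a cell holds only whitespace, no-break spaces included.
# String#strip on its own does not remove U+00A0, so normalize first.
def blank_cell?(text)
  normalize_nbsp(text).strip.empty?
end

cell = "\u00A0" # stands in for the text of <td>&nbsp;</td>
puts normalize_nbsp(cell).inspect
puts blank_cell?(cell)

# Assumption: the "?" appears when output uses an encoding without U+00A0.
puts NBSP.encode("US-ASCII", undef: :replace).inspect
```

With Nokogiri you would pass content.text to these helpers. Note that gsub returns a new string, so assign the result if you want to keep it.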

Upvotes: 2
