webmagnets

Reputation: 2296

How can I fix this error: "incompatible character encodings: UTF-8 and ASCII-8BIT"?

I am trying to use the rmmseg-cpp gem's sample code documented here: http://rmmseg-cpp.rubyforge.org/#Stand-Alone-rmmseg

Just to test it out, I put it in show.html.erb like this:

# coding: UTF-8
<p id="notice"><%= notice %></p>

<p>
  <b>Title:</b>
  <%= @lesson.title %>
</p>

<p>
  <b>Content:</b>
  <%= @lesson.content %> # simplified chinese text
</p>

<p><% require 'rmmseg' %>
<% algor = RMMSeg::Algorithm.new(@lesson.content) %>
<% loop do %>
  <% tok = algor.next_token %>
  <% break if tok.nil? %>
  <%= "#{tok.text} [#{tok.start}..#{tok.end}]" %>
<% end %> </p>

<%= link_to 'Edit', edit_lesson_path(@lesson) %> |
<%= link_to 'Back', lessons_path %>

I get the following error:

 Encoding::CompatibilityError in Lessons#show

Showing /Users/webmagnets/rails_projects/blt/app/views/lessons/show.html.erb where line #19 raised:

incompatible character encodings: UTF-8 and ASCII-8BIT

Extracted source (around line #19):

16: <% loop do %>
17:   <% tok = algor.next_token %>
18:   <% break if tok.nil? %>
19:   <%= "#{tok.text} [#{tok.start}..#{tok.end}]" %>
20: <% end %> </p>
21: 
22: <%= link_to 'Edit', edit_lesson_path(@lesson) %> |

Rails.root: /Users/webmagnets/rails_projects/blt
Application Trace | Framework Trace | Full Trace

app/views/lessons/show.html.erb:19:in `block in _app_views_lessons_show_html_erb___3831310028264182552_70339844987120'
app/views/lessons/show.html.erb:16:in `loop'
app/views/lessons/show.html.erb:16:in `_app_views_lessons_show_html_erb___3831310028264182552_70339844987120'
app/controllers/lessons_controller.rb:20:in `show'

Request

Parameters:

{"id"=>"1"}

Response

Headers:

None

If you need any more info, please let me know.

Upvotes: 0

Views: 2628

Answers (2)

webmagnets

Reputation: 2296

This link helped me: https://github.com/sinatra/sinatra/issues/559#issuecomment-7748296

I used <% text = tok.text.force_encoding 'UTF-8' %> and it worked.
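My understanding of why this works, as a minimal sketch: I am assuming rmmseg hands back strings whose bytes are valid UTF-8 but which are tagged ASCII-8BIT (binary). The string `raw` below is a stand-in for `tok.text`, and `page` is a stand-in for the UTF-8 text already rendered in the view; neither name comes from the gem.

```ruby
# -*- coding: utf-8 -*-
# Stand-in for tok.text: valid UTF-8 bytes tagged as ASCII-8BIT (an assumption
# about what rmmseg returns, inferred from the error message).
raw = "中文".dup.force_encoding(Encoding::ASCII_8BIT)

# Stand-in for UTF-8 text already in the page, such as the lesson content.
page = "内容: "

# Joining non-ASCII UTF-8 text with non-ASCII binary text raises the error.
begin
  page + raw
  raised = false
rescue Encoding::CompatibilityError
  raised = true
end

# force_encoding only changes the encoding tag; the bytes stay the same.
fixed = raw.dup.force_encoding(Encoding::UTF_8)
joined = page + fixed

puts raised                    # whether the mixed concatenation failed
puts fixed.bytes == raw.bytes  # whether the bytes were left untouched
puts joined
```

Since force_encoding does not check the bytes, calling `valid_encoding?` on the result is a cheap way to confirm the assumption holds for your data.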

Thanks @zed_0xff for putting me on the right path.

Upvotes: 5

zed_0xff

Reputation: 33217

Try this workaround:

<% text = tok.text.encode('utf-8',:invalid => :replace, :undef => :replace) %>
<%= "#{text} [#{tok.start}..#{tok.end}]" %>
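One caveat worth checking before relying on this: if the token text is tagged ASCII-8BIT, `encode` transcodes from that encoding, and bytes above 0x7F have no defined mapping to UTF-8, so the `:replace` options substitute them instead of keeping the Chinese characters. The sketch below compares that with relabelling the bytes. The helper name `to_utf8` is hypothetical, `raw` is a stand-in for `tok.text`, and `String#scrub` needs Ruby 2.1 or later.

```ruby
# -*- coding: utf-8 -*-
# Stand-in for tok.text, assuming it is UTF-8 bytes tagged as ASCII-8BIT.
raw = "中文".dup.force_encoding(Encoding::ASCII_8BIT)

# Transcoding from ASCII-8BIT: every high byte is undefined, so it is replaced.
transcoded = raw.encode(Encoding::UTF_8, invalid: :replace, undef: :replace)

# Hypothetical helper: relabel the bytes as UTF-8, then replace only the
# byte sequences that really are invalid UTF-8.
def to_utf8(str)
  str.dup.force_encoding(Encoding::UTF_8).scrub("?")
end

puts transcoded == "中文"   # whether transcoding preserved the text
puts to_utf8(raw) == "中文" # whether relabelling preserved the text
puts to_utf8("\xFFabc".b)   # a stray invalid byte is replaced, the rest kept
```

In the view, this would be used as `<%= "#{to_utf8(tok.text)} [#{tok.start}..#{tok.end}]" %>`.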

Upvotes: 1
