Reputation: 2306
I am trying to build a 2D array of sentences and their words, and then a second 2D array with the same shape that holds the English translations.
Here is the callback from the Lesson model that happens when I create a new lesson:
before_create do |lesson|
  require 'rmmseg'
  require "to_lang"
  require "bing_translator"
  lesson.parsed_content = []
  lesson.html_content = []
  RMMSeg::Dictionary.load_dictionaries
  text = lesson.content
  text = text.gsub("。","^^.")
  text = text.gsub("?","~~?")
  text = text.gsub("!", "||!")
  text = text.split(/[.?!]/u) #convert to an array
  text.each do |s|
    s.gsub!("^^","。")
    s.gsub!("~~","?")
    s.gsub!("||","!")
  end
  text.each_with_index do |val, index|
    algor = RMMSeg::Algorithm.new(text[index])
    splittext = []
    loop do
      tok = algor.next_token
      break if tok.nil?
      tex = tok.text.force_encoding('UTF-8')
      splittext << tex
      text[index] = splittext
    end
  end
  lesson.parsed_content = text
  textarray = text
  translator = BingTranslator.new(BING_CLIENT_ID, BING_API_KEY)
  ToLang.start(GOOGLE_TRANSLATE_API)
  textarray.each_with_index do |sentence, si| #iterate array of sentence
    textarray[si] = []
    sentence.each_with_index do |word,wi| #iterate sentence's array of words
      entry = DictionaryEntry.find_by_simplified(word) #returns a DictionaryEntry object hash
      if entry == nil #for cases where there is no DictionaryEntry
        textarray[si] << word
      else
        textarray[si] << entry.definition
      end
    end
    lesson.html_content = textarray
  end
end
Why are my variables lesson.parsed_content and lesson.html_content ending up equal to each other?
I was expecting lesson.parsed_content to be Chinese and lesson.html_content to be English, but they both end up being English. I am probably too tired, but I can't see why lesson.parsed_content ends up English too.
Upvotes: 0
Views: 74
Reputation: 434945
You're referencing the same array in both of them:
lesson.parsed_content = text
textarray = text
# Various in-place modifications of textarray...
lesson.html_content = textarray
Just doing lesson.parsed_content = text doesn't duplicate text, it just copies the reference so you end up with four things pointing at the same piece of data:
text ------------------=-+--+--+----> [ ... ]
lesson.parsed_content -=-/  |  |
lesson.html_content ---=----/  |
textarray -------------=-------/
Each assignment simply adds another pointer to the same underlying array.
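To see the sharing in isolation, here is a minimal sketch. The alias_demo method and the sample words are made up for illustration; only the variable names come from the question.

```ruby
# Hypothetical helper: shows that plain assignment shares one array.
def alias_demo
  text = [["你", "好"], ["再", "见"]]
  parsed_content = text   # copies the reference, not the array
  textarray = text        # yet another name for the same array

  textarray[0] = ["you", "good"]   # in-place change through one name

  # Same object, so the change is visible through every name.
  [parsed_content.equal?(textarray), parsed_content[0]]
end

p alias_demo   # [true, ["you", "good"]]
```

equal? compares object identity, so a true result means both names refer to the very same array.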
A simple lesson.parsed_content = text.dup is not a reliable fix either, because dup only makes a shallow copy: the outer array is duplicated but the inner arrays are still shared. Since you know that you have an array-of-arrays, you could dup the outer and inner arrays by hand to get a full copy, or you could use one of the standard deep copying approaches such as a round trip through Marshal. Or skip the copying altogether: iterate over textarray but build the translations in a separate array.
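The options above could look roughly like this. It is a sketch under assumptions, not the asker's actual code: sample_text, GLOSSARY and the helper method names are hypothetical, and GLOSSARY merely stands in for the DictionaryEntry lookup.

```ruby
# Stand-in for the lesson's array of sentences, each an array of words.
def sample_text
  [["你", "好"], ["再", "见"]]
end

# Hypothetical dictionary standing in for DictionaryEntry lookups.
GLOSSARY = { "你" => "you", "好" => "good" }.freeze

# Shallow copy: new outer array, but the inner arrays are still shared.
def shallow_copy(rows)
  rows.dup
end

# Copy by hand: a new outer array holding a dup of each inner array
# (the strings themselves are still shared).
def deep_copy_by_hand(rows)
  rows.map(&:dup)
end

# Deep copy via a Marshal round trip (only works for marshalable data).
def deep_copy_marshal(rows)
  Marshal.load(Marshal.dump(rows))
end

# No copying: build the translated array separately, leaving rows untouched.
def translate(rows)
  rows.map { |row| row.map { |word| GLOSSARY.fetch(word, word) } }
end

text = sample_text
shallow = shallow_copy(text)
deep = deep_copy_marshal(text)
text[0] << "吗"                  # mutate an inner array in place
p shallow[0].length              # 3: the shallow copy sees the change
p deep[0].length                 # 2: the deep copy does not
p translate(sample_text)[0]      # ["you", "good"]
```

The map-based version is probably the least surprising of the three, since the original array is never written to and nothing needs to be copied first.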
Upvotes: 4