Reputation: 13
I have two text files, file1.txt and file2.txt. I want to find the difference between the files, highlighting the equal, inserted and deleted text. The final goal is to create an HTML file in which the equal, inserted and deleted text is highlighted with different colors and styles.
file1.txt
I am testing this ruby code for printing the file diff.
file2.txt
I am testing this code for printing the file diff.
I am using this code:
require 'diffy'
doc1 = File.read('file1.txt') # File.read closes the file handle once it is done
doc2 = File.read('file2.txt')
final_doc = Diffy::Diff.new(doc1, doc2).each_chunk.to_a
The output is:
-I am testing this ruby code for printing the file diff.
+I am testing this code for printing the file diff.
However, I need the output in a format similar to the one below.
equal:
I am testing this
insertion:
ruby
equal:
code for printing the file diff.
In Python there is difflib, through which this can be achieved, but I have not found such functionality in Ruby.
Upvotes: 1
Views: 2344
Reputation: 534
I've found there are a few different Ruby libraries for doing "diffs", but they focus on comparing line by line. I wrote some code that compares a couple of relatively short strings and shows the differences. It is a quick hack that works well as long as you don't need the removed sections highlighted at the positions they were removed from; doing that would require a bit more thought about the algorithm. For a small amount of text at a time, though, this code works well.
The key, as with any language processing, is getting your tokenization right. You can't just process a string word by word. Really, the best way would be to first loop through the text, recursively, associate each token with its position, and use that for the analysis, but the method below is fast and easy.
def self.change_differences(text1, text2) # old text, new text
  remaining = text1.dup # work on a copy so the caller's string is not mutated
  result = ""
  # Split after sentence punctuation; the positive lookbehind keeps the punctuation with its token.
  text2.split(/(?<=[?.!,])/).each do |token|
    if remaining.sub!(token, "") # non-nil when the old text still contained the token
      result += "<span class='diffsame'>" + token + "</span>"
    else
      result += "<span class='diffadd'>" + token + "</span>"
    end
  end
  # Whatever is left of the old text was never matched, so it was removed.
  remaining.split(/(?<=[?.!,])/).each do |token|
    result += "<span class='diffremove'>" + token + "</span>"
  end
  result
end
Source: me!
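If the equal/insertion/deletion chunks from the question are needed at word level rather than sentence level, one possible approach is a longest-common-subsequence (LCS) comparison over whitespace-separated words. The following is only a minimal sketch, not the method above and not part of any library: the names `word_diff`, `diff_to_html`, `CSS_CLASS` and `HTML_ESCAPES` are hypothetical, the CSS class names are borrowed from the code above, and whitespace is normalized to single spaces. The LCS table takes memory proportional to the product of the two word counts, so it suits short texts only.

```ruby
# Hypothetical helper: word-level diff using a longest-common-subsequence table.
# Returns an array of [tag, text] pairs, where tag is :equal, :insertion or :deletion.
def word_diff(old_text, new_text)
  a = old_text.split
  b = new_text.split

  # lcs[i][j] = length of the LCS of a[i..-1] and b[j..-1]
  lcs = Array.new(a.size + 1) { Array.new(b.size + 1, 0) }
  (a.size - 1).downto(0) do |i|
    (b.size - 1).downto(0) do |j|
      lcs[i][j] = if a[i] == b[j]
                    lcs[i + 1][j + 1] + 1
                  else
                    [lcs[i + 1][j], lcs[i][j + 1]].max
                  end
    end
  end

  chunks = []
  # Append a word, merging it into the previous chunk when the tag is the same.
  push = lambda do |tag, word|
    if chunks.last && chunks.last[0] == tag
      chunks.last[1] << " " << word
    else
      chunks << [tag, word.dup]
    end
  end

  # Walk both word lists, following the LCS table to decide each step.
  i = j = 0
  while i < a.size && j < b.size
    if a[i] == b[j]
      push.call(:equal, a[i])
      i += 1
      j += 1
    elsif lcs[i + 1][j] >= lcs[i][j + 1]
      push.call(:deletion, a[i])
      i += 1
    else
      push.call(:insertion, b[j])
      j += 1
    end
  end
  a[i..-1].each { |word| push.call(:deletion, word) }
  b[j..-1].each { |word| push.call(:insertion, word) }
  chunks
end

# Assumed mapping from chunk tags to the CSS classes used in the code above.
CSS_CLASS = { equal: "diffsame", insertion: "diffadd", deletion: "diffremove" }.freeze
HTML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;", "'" => "&#39;" }.freeze

# Hypothetical helper: wrap each chunk in a span, escaping HTML special characters.
def diff_to_html(chunks)
  chunks.map do |tag, text|
    "<span class='#{CSS_CLASS[tag]}'>#{text.gsub(/[&<>"']/, HTML_ESCAPES)}</span>"
  end.join(" ")
end
```

Called as `word_diff(File.read('file1.txt'), File.read('file2.txt'))` on the two files from the question, this returns three chunks: `:equal` "I am testing this", `:deletion` "ruby", and `:equal` "code for printing the file diff.". The word "ruby" comes out as a deletion because it appears only in the first file; swapping the arguments reports it as an insertion.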
Upvotes: 1