user7055375
user7055375

Reputation:

Two strings being almost equal

I define that two strings are almost equal if:

  1. they have the same length, or
  2. their length differs by one, and the strings differ by a single character.

These two strings are almost equal:

HOW DO YOU
HO DO YOU

as are these:

abcdef
bcdef

But these strings are not almost equal:

Almost
Anost

nor are these:

Almost
Aomst

I have this function that I attempted to make it tell if two strings are almost equal:

def str_almost_equal(a, b)
  a.downcase == b.downcase || (a.size == b.size && a.downcase.chars.map.with_index{|c, i| c == b.downcase[i]}.count(false) == 1)
end

Calling the code above with "aaa" and "aab" evaluates to true.

How do I expand my function so that strings are almost equal if in addition to the above, the strings lengths differ by only one and the characters are identical except for one?

Upvotes: 1

Views: 1774

Answers (3)

Mark Thomas
Mark Thomas

Reputation: 37507

If order was not an issue, then you could just compute set difference of the characters:

def str_almost_equal(a, b)
  shortest, longest = [a.chars,b.chars].minmax_by(&:length)
  (longest - shortest).length == 1
end

However, your last test indicates that order is indeed significant. So this is more of a longest common subsequence problem with these characteristics:

  • The LCS must be equal to the smaller string (in other words, the small string must be completely contained within the larger string, but not necessarily consecutive)
  • The larger string must be larger by exactly one character

So, given an lcs function, you can do this:

def str_almost_equal(a, b)
  shortest, longest = [a,b].minmax_by(&:length)
  lcs(a,b) == shortest && longest.length - shortest.length == 1
end

You can find lcs functions at the above link. Here is one:

def lcs(xstr, ystr)
  return "" if xstr.empty? || ystr.empty?

  x, xs, y, ys = xstr[0..0], xstr[1..-1], ystr[0..0], ystr[1..-1]
  if x == y
    x + lcs(xs, ys)
  else
    [lcs(xstr, ys), lcs(xs, ystr)].max_by {|x| x.size}
  end
end

There is also the diff-lcs gem you can look into.

Upvotes: 0

Edwin Diaz
Edwin Diaz

Reputation: 1763

Try finding the intersection of the two strings.

check this link out here that provides the number of same characters between two strings.

You could check the length of the longest string, to the number of characters that they intersect.

E.g. if the longer string is of length n, the intersection should be n-1 to be "almost" equal

Upvotes: 1

Michael Chaney
Michael Chaney

Reputation: 3031

Use the "fuzzy-string-match" gem in your Gemfile:

gem 'fuzzy-string-match'

It's really easy to use:

2.2.7 :001 > require 'fuzzystringmatch'
 => true 
2.2.7 :002 >     jarow = FuzzyStringMatch::JaroWinkler.create(:pure)
 => #<FuzzyStringMatch::JaroWinklerPure:0x007fa08c4d8710> 
2.2.7 :003 > jarow.getDistance('Almost', 'Aomst')
 => 0.8900000000000001 
2.2.7 :004 > jarow.getDistance('Almost', 'Anost')
 => 0.8400000000000001 
2.2.7 :005 > jarow.getDistance('Almost', 'Almost')
 => 1.0 

I use it for fuzzy string matching and it's great. In my case, I'm matching file names against song titles, and I do a cartesian join (basically, match every filename against every title) and then take the top hits for each one, at least when they're past a certain threshold.

Upvotes: 1

Related Questions