Anthoney Kalasho

Reputation: 51

Check similarity of two texts by word?

I would like to check the similarity of two strings by word.

I tried using php.js similar_text:

http://phpjs.org/functions/similar_text/

But it compares the strings letter by letter, so, for example, checking the similarity of "ddda" against "addd" would return 100%.

I would like a function that compares word by word, so that "Hello World" checked against "Hello" would return 50%.

Upvotes: 2

Views: 899

Answers (2)

Chedi Bechikh

Reputation: 173

This post is old, but if you want to check the similarity of words or text fragments, you can use explicit semantic analysis (ESA). You can read a paper about it here: https://www.jair.org/media/2669/live-2669-4346-jair.pdf

You can use this library on Linux; it is very simple to use: http://lukas.zilka.me/esalib/

Upvotes: 0

smnbbrv

Reputation: 24571

I don't understand what you mean exactly by similarity, but you can try this:

var a = "hello world", b = "hello 123";

function similarity(a, b) {
  // split on runs of non-word characters and drop empty tokens
  var arrayA = a.split(/\W+/).filter(Boolean),
      arrayB = b.split(/\W+/).filter(Boolean),
      result = 0;

  // avoid dividing by zero when a contains no words
  if (arrayA.length === 0) return 0;

  // loop through the words of a
  for (var i = 0, imax = arrayA.length; i < imax; i++) {
    // for every word, count its occurrences in text b
    result += arrayB.reduce(function (count, word) {
      return count + (arrayA[i] === word ? 1 : 0);
    }, 0);
  }

  // change here to your understanding of similarity
  return result / imax * 100;
}

alert(similarity(a, b) + "%");


You will probably want to extend it, for example with handling for duplicate words, but this is a basis you can build your own implementation on.
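As one possible way to handle duplicates, here is a minimal sketch that compares sets of distinct words instead of counting every occurrence. With the counting approach above, comparing "hello hello" against itself yields 200%, because each copy of the word matches both copies in the other text. The names `toWords` and `wordSimilarity` are hypothetical, and the lowercasing and the choice to divide by the number of distinct words in `a` are assumptions about what "similarity" should mean here.

```javascript
// Hypothetical helper: lowercase the text and split it into words,
// dropping the empty tokens that split() can produce.
function toWords(text) {
  return text.toLowerCase().split(/\W+/).filter(Boolean);
}

// Percentage of distinct words in `a` that also appear in `b`.
// Assumption: duplicates and letter case are ignored.
function wordSimilarity(a, b) {
  var wordsA = new Set(toWords(a));
  var wordsB = new Set(toWords(b));

  // no words in a: define the similarity as 0 rather than dividing by zero
  if (wordsA.size === 0) return 0;

  var shared = 0;
  wordsA.forEach(function (word) {
    if (wordsB.has(word)) shared++;
  });

  return shared / wordsA.size * 100;
}

console.log(wordSimilarity("Hello World", "Hello") + "%");
console.log(wordSimilarity("hello hello", "hello hello") + "%");
```

Note that this measure is not symmetric: "Hello" checked against "Hello World" gives 100%, since every word of the first text occurs in the second. If you need a symmetric score, you could divide by the size of the union of both sets instead.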

Upvotes: 3
