Wild Widow

Reputation: 2549

Algorithm to find a similarity score from Facebook and Twitter friends?

Using PHP, I fetched my friends lists from Facebook and Twitter and stored each list in an associative array. I have both the names and the locations. I want to compare the friends from Facebook and Twitter based on name and location, and produce a similarity score.

For example, I want to set a threshold of about 0.7; if the score for a person is above that, the two profiles represent the same entity. I have tried the PHP function similar_text, but it is too basic: it gives a 50-60% match for almost every friend, as it is based only on the words in the name.

Any suggestions?

Upvotes: 1

Views: 335

Answers (1)

Alec Martin

Reputation: 126

You may want to consider the vector space model: represent each name and location as a dimension in a very high-dimensional space. Represent Twitter as one vector and Facebook as another. If, for example, I have a friend named Mike on both Facebook and Twitter, the "Mike" dimension has a non-zero value in both vectors. By comparing the angle between these two vectors, I can compute a similarity score. A smaller angle indicates a higher degree of similarity. A simple example:

My twitter friends: Ada Alan Beth Dana Jon

My facebook friends: Anne Beth Dana Jon

Space contains dimensions: < Ada, Alan, Anne, Beth, Dana, Jon >

Twitter vector: t = < 1, 1, 0, 1, 1, 1 >

Facebook vector: f = < 0, 0, 1, 1, 1, 1 >

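The angle between them is arccos( (f · t) / (|f| * |t|) ). Here f · t = 3, |f| = 2 and |t| = sqrt(5), so the cosine is 3 / (2 * sqrt(5)) ≈ 0.67 and the angle is about 48 degrees. The cosine itself can serve as the similarity score: it ranges from 0 (nothing in common) to 1 (identical lists).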
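The example above can be checked with a few lines of code. This is a minimal sketch in Python rather than PHP, and the function name `cosine_similarity` is my own. It assumes each friends list is a set of exact name strings, so the dot product of the two 0/1 vectors is just the number of shared names.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two sets treated as 0/1 vectors."""
    if not a or not b:
        return 0.0                      # an empty list has no direction
    shared = len(a & b)                 # dot product of two binary vectors
    return shared / math.sqrt(len(a) * len(b))

twitter = {"Ada", "Alan", "Beth", "Dana", "Jon"}
facebook = {"Anne", "Beth", "Dana", "Jon"}

score = cosine_similarity(twitter, facebook)
# Clamp before acos to guard against floating-point values slightly above 1.
angle = math.degrees(math.acos(min(1.0, score)))
print(round(score, 3), round(angle, 1))  # 0.671 47.9
```

Note that this scores the two friends lists as a whole, not an individual pair of profiles.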

See https://en.wikipedia.org/wiki/Vector_space_model
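To score one Facebook friend against one Twitter friend, as the question asks, the same idea can be applied per person: treat the words in the name and location as the dimensions. The following Python sketch is an adaptation under my own assumptions, not a tested recipe. The record layout (`name` and `location` keys), the helper names and the sample friends are all hypothetical, and the 0.7 threshold is the one from the question.

```python
import math
import re
from collections import Counter

def tokens(friend):
    """Bag of lowercase words from a friend's name and location."""
    text = f"{friend['name']} {friend['location']}".lower()
    # Simplification: only ASCII letters and digits count as word characters.
    return Counter(re.findall(r"[a-z0-9]+", text))

def cosine(u, v):
    """Cosine similarity between two word-count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm = math.sqrt(sum(c * c for c in u.values())) * \
           math.sqrt(sum(c * c for c in v.values()))
    return dot / norm if norm else 0.0

THRESHOLD = 0.7  # value suggested in the question

# Hypothetical sample records
fb_friend = {"name": "Beth Jones", "location": "Austin, Texas"}
tw_same = {"name": "Beth Jones", "location": "Austin"}
tw_other = {"name": "Jon Smith", "location": "Boston"}

for candidate in (tw_same, tw_other):
    score = cosine(tokens(fb_friend), tokens(candidate))
    print(candidate["name"], round(score, 3), score >= THRESHOLD)
```

Exact word matching will miss nicknames and misspellings, so in practice the tokens may need normalizing, or a string-distance measure may be needed on top.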

Upvotes: 1
