Reputation: 325
We have an issue with our data structure in SQL Server as we are ordering by a calculated field. With this in mind we were wondering if it would be possible to use map reduce to prepare indexes to keep our responses snappy - specifically within RavenDB.
FYI - I asked this previously on DBA and got some great answers but wanted to consider this different approach:
I will try and give an example - this is not my table structure - I'm simply trying to outline the issue in order to find a solution...
Person Id, Name
BrothersNames Id, Name
SistersNames Id, Name
PersonBrothers (join table) PersonId, BrotherNameId
PersonSisters (join table) PersonId, SisterNameId
OK - so imagine this database holds every person from a small country. The database holds a record of the names of everyone's brothers and sisters (it does not map a person to their brother or sister - just their names) so that we can find out statistics about names.
Obviously lots of names are shared so currently in SQL Server the join tables normalise this for us.
What I want to do is take one user and find out the number of matches of brother's names and number of matches of sister's names with every other user in the system, then add those two matches together and order by that descending. So this would give us a list of users who have the most number of brothers and sister's names in common.
I'm really only interested in the top ten matches but I think I have to get the whole result set to work out the top ten matches.
Please note that in my actual data a person can have a million brothers or a milllion sisters. This is where I'm getting performance issues.
This is how I'm calculating the matches for brothers and I do the same for sisters
select p.id, matches
FROM Person p
LEFT JOIN
(
SELECT
COUNT(*) AS Matches,
pbn.PersonId
FROM PersonBrothersNames pbn
INNER JOIN Brothersnames bn on pbn.BrothernameId =bn.Id
inner join PersonBrothersName otherpbn on otherpbn.BrothernameId = bn.Id
WHERE pbn.PersonId= @PersonId and pbn.PersonId <> otherpbn.personid
GROUP BY pbn.PersonId
) As BrothersNamesJoin ON BrothersNamesJoin.Person = p.Id
Upvotes: 2
Views: 154
Reputation: 22956
What you can do is something like this:
{ "Name": "a", "Brothers": ["b","c"] }
Then you can index them, and use the More Like This bundle to search for the other people with similar brothers names. And yes, it will rank them for you.
Upvotes: 2