There are two groups of users. Based on a user's query, I return some search results (a1, a2, a3). The results can vary with the group the user belongs to or with some user-specific parameter. I want to measure whether the results returned to different users for the same query differ significantly (say, when more than 7 of the first 10 results differ).
Is there any real-time or batch learning algorithm for this?
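To make the "more than 7 of the first 10" criterion concrete, here is a minimal sketch. The function names, the default `k=10` and `threshold=7`, and the choice to ignore rank order within the top 10 are my assumptions, not a settled definition.

```python
def top_k_difference(results_a, results_b, k=10):
    """Count the results in the first k of results_a that are absent
    from the first k of results_b (rank order is ignored)."""
    top_a = set(results_a[:k])
    top_b = set(results_b[:k])
    return len(top_a - top_b)


def significantly_different(results_a, results_b, k=10, threshold=7):
    """Assumed rule: flag the pair when more than `threshold`
    of the first k results differ."""
    return top_k_difference(results_a, results_b, k) > threshold
```

For example, two lists that share only 2 of their first 10 results differ in 8 positions and would be flagged, while two identical lists would not.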
Here is what I am planning so far:
Batch incoming events over some time interval, say 5 minutes.
Group all the responses by (groupid, query), so that I have a list of records of the form
(query1, group1, r1,r2,r3,...,r10)
(query1, group2, r1,r4,r5,...,r11)
(query1, group1, r2,r1,r3,...,r9)
(query1, group2, r3,r4,r5,...,r11)
Calculate the frequency distribution of results by groupid for a given query.
(query1, group1): r1:5,r2:7,r3:10,r4:9 ... r11:10
(query1, group2): r1:3,r2:9,r3:11,r4:11 ... r11:1
Now measure how much group1 and group2 differ from each other for that query, using the chi-square distance between their frequency distributions.
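The steps above could be sketched roughly as follows. This assumes each record is a `(query, group, results)` tuple and uses one common form of the chi-square distance, 0.5 * sum((p - q)^2 / (p + q)) over normalized frequencies. The function names and the normalization are my assumptions.

```python
from collections import Counter, defaultdict


def frequency_by_group(records):
    """Map each (query, group) pair to a Counter of result frequencies.

    `records` is an iterable of (query, group, results) tuples,
    where `results` is the list of result ids returned to one user.
    """
    freqs = defaultdict(Counter)
    for query, group, results in records:
        freqs[(query, group)].update(results)
    return freqs


def chi_square_distance(counts_a, counts_b):
    """Chi-square distance between two frequency distributions.

    Counts are normalized to proportions first, so the value lies
    in [0, 1]: 0 for identical distributions, 1 for disjoint ones.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    if total_a == 0 or total_b == 0:
        raise ValueError("both distributions need at least one observation")
    distance = 0.0
    for key in set(counts_a) | set(counts_b):
        p = counts_a.get(key, 0) / total_a
        q = counts_b.get(key, 0) / total_b
        if p + q > 0:  # skip results unseen in both groups
            distance += (p - q) ** 2 / (p + q)
    return 0.5 * distance
```

With the records from one batch window, the comparison for a query would be `chi_square_distance(freqs[("query1", "group1")], freqs[("query1", "group2")])`. Note that this is a distance, not a significance test. A chi-square test of homogeneity on the raw counts would be the step that gives a p-value.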
I have a few questions about this approach.
Literature suggestions are also welcome.