archenoo

Reputation: 149

How to vectorize a search function and intersection in MATLAB?

Here is a MATLAB coding problem (a slightly different version, using setdiff rather than intersect, was asked previously):

I have a rating matrix A with 3 columns: the 1st column is the user ID, which may be duplicated; the 2nd column is the item ID, which may be duplicated; and the 3rd column is the user's rating of the item, ranging from 1 to 5.

Now, I have a subset of user IDs, smallUserIDList, and a subset of item IDs, smallItemIDList. For each user in smallUserIDList, I want to find the rows of A rated by that user, collect the items that user rated, and do some calculation on them, such as intersecting with smallItemIDList and counting the result, as the following code does:

userStat = zeros(length(smallUserIDList), 1);
for i = 1:length(smallUserIDList)
    A2= A(A(:,1) == smallUserIDList(i), :);
    itemIDList_each = unique(A2(:,2));

    setIntersect = intersect(itemIDList_each , smallItemIDList);
    userStat(i) = length(setIntersect);
end
userStat

The profiler shows that the loop above is inefficient. How can I improve this piece of code with vectorization, without the help of a for loop?

For example:

Input:

A = [
1 11 1
2 22 2
2 66 4
4 44 5
6 66 5
7 11 5
7 77 5
8 11 2
8 22 3
8 44 3
8 66 4
8 77 5    
]

smallUserIDList = [1 2 7 8]
smallItemIDList = [11 22 33 55 77]

Output:

userStat =

 1
 1
 2
 3

Upvotes: 1

Views: 91

Answers (1)

Divakar

Reputation: 221744

Ah! You need a tiny edit in the accepted solution to the previous question. Here's the solution -

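% Pair each row of A with the matching entry of smallUserIDList:
% R = row index into A, C = index into smallUserIDList.
% (On R2016b and later, A(:,1) == smallUserIDList(:).' does the same.)
[R,C] = find(bsxfun(@eq,A(:,1),smallUserIDList(:).'));
% Flag matched rows whose item is in smallItemIDList (the edit was needed here)
mask = ismember(A(R,2),smallItemIDList(:).');

ARm = A(R,2);
Cm = C(mask);     % user index of each kept row
ARm = ARm(mask);  % item ID of each kept row

userStat = zeros(numel(smallUserIDList),1);
if ~isempty(Cm)
    % Per user: kept rows that repeat an item already counted for that user
    dup_counts = accumarray(Cm,ARm,[],@(x) numel(x)-numel(unique(x)));
    % Per user: number of kept rows, repeats included
    accums = accumarray(C,mask);
    userStat(1:numel(accums)) = accums;
    userStat(1:numel(dup_counts)) = userStat(1:numel(dup_counts)) - dup_counts;
end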
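
For comparison, here is a shorter alternative sketch. It is not the approach above: it filters rows by item with ismember, drops repeated (user, item) pairs with unique(...,'rows'), and counts per user with accumarray. It assumes smallUserIDList contains no repeated IDs (ismember reports only the first matching position), and it reuses the example data from the question. The names pairs, tf and loc are only illustrative.

```matlab
% Example data from the question
A = [1 11 1; 2 22 2; 2 66 4; 4 44 5; 6 66 5; 7 11 5; ...
     7 77 5; 8 11 2; 8 22 3; 8 44 3; 8 66 4; 8 77 5];
smallUserIDList = [1 2 7 8];
smallItemIDList = [11 22 33 55 77];

% Keep rows whose item is in smallItemIDList, then drop repeated (user,item) pairs
pairs = unique(A(ismember(A(:,2), smallItemIDList), 1:2), 'rows');

% Position of each remaining user in smallUserIDList (tf is false if absent)
[tf, loc] = ismember(pairs(:,1), smallUserIDList);

% Count the remaining pairs per user position (assumes unique user IDs in the list)
userStat = accumarray(loc(tf), 1, [numel(smallUserIDList) 1]);
```

On the example data this gives the same counts as the loop in the question.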

As a bonus, you can replace the pre-allocation step -

userStat = zeros(numel(smallUserIDList),1);

with this pre-allocation scheme, which can be faster -

userStat(numel(smallUserIDList),1) = 0; % column vector, like zeros(n,1); assumes userStat does not exist yet

Read more about it in this MATLAB Undocumented post on Pre-allocation.
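
A small sketch of how indexed-assignment pre-allocation behaves, assuming the variables do not exist beforehand (v, w and n are illustrative names, not part of the solution). The order of the indices decides whether you get a column or a row, and an existing variable would keep its old contents rather than being reset.

```matlab
clear v w          % indexed-assignment pre-allocation needs fresh variables
n = 4;
v(n,1) = 0;        % grows v to an n-by-1 column of zeros
w(1,n) = 0;        % grows w to a 1-by-n row of zeros
disp(size(v))
disp(size(w))
```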

Upvotes: 1
