Reputation: 149
Here is a MATLAB coding problem (a slightly different version, using setdiff rather than intersect, is here):
A rating matrix A has 3 columns: the 1st column is the user ID, which may be duplicated; the 2nd column is the item ID, which may be duplicated; the 3rd column is the user's rating of the item, ranging from 1 to 5.
Now, I have a subset of user IDs, smallUserIDList, and a subset of item IDs, smallItemIDList. I want to find the rows in A rated by users in smallUserIDList, collect the items each user rated, and do some calculations, such as intersecting with smallItemIDList and counting the result, as the following code does:
userStat = zeros(length(smallUserIDList), 1);
for i = 1:length(smallUserIDList)
    A2 = A(A(:,1) == smallUserIDList(i), :);   % rows rated by the i-th user
    itemIDList_each = unique(A2(:,2));         % distinct items that user rated
    setIntersect = intersect(itemIDList_each, smallItemIDList);
    userStat(i) = length(setIntersect);        % how many of them are in smallItemIDList
end
userStat
The profiler shows that the loop above is inefficient. How can I improve this piece of code with vectorization, rather than relying on the for loop?
For example:
Input:
A = [
1 11 1
2 22 2
2 66 4
4 44 5
6 66 5
7 11 5
7 77 5
8 11 2
8 22 3
8 44 3
8 66 4
8 77 5
]
smallUserIDList = [1 2 7 8]
smallItemIDList = [11 22 33 55 77]
Output:
userStat =
1
1
2
3
Upvotes: 1
Views: 91
Reputation: 221744
Ah! You need a tiny edit in the accepted solution to the previous question. Here's the solution -
% R: row of A, C: position in smallUserIDList, for every matching (row, user) pair
[R,C] = find(bsxfun(@eq,A(:,1),smallUserIDList(:).'));
mask = ismember(A(R,2),smallItemIDList(:).'); % the edit was needed here
ARm = A(R,2);
Cm = C(mask);
ARm = ARm(mask);
userStat = zeros(numel(smallUserIDList),1);
if ~isempty(Cm)
    % per user: matching ratings that repeat an item already counted
    dup_counts = accumarray(Cm,ARm,[],@(x) numel(x)-numel(unique(x)));
    accums = accumarray(C,mask); % per user: matching ratings, repeats included
    userStat(1:numel(accums)) = accums;
    userStat(1:numel(dup_counts)) = userStat(1:numel(dup_counts)) - dup_counts;
end
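For comparison, here is a minimal sketch of a different route to the same counts. It is not the approach used above: it removes duplicate (user, item) pairs first with unique(...,'rows'), so no duplicate correction is needed afterwards. It assumes smallUserIDList contains no repeated IDs (ismember reports only the first position of a repeated ID), and it reuses the example data from the question.

```matlab
% Example data from the question
A = [1 11 1; 2 22 2; 2 66 4; 4 44 5; 6 66 5; 7 11 5; 7 77 5; ...
     8 11 2; 8 22 3; 8 44 3; 8 66 4; 8 77 5];
smallUserIDList = [1 2 7 8];
smallItemIDList = [11 22 33 55 77];

% One row per distinct (user, item) pair, so repeated ratings count once
pairs = unique(A(:,1:2), 'rows');

% Keep only the pairs whose item is in smallItemIDList
pairs = pairs(ismember(pairs(:,2), smallItemIDList), :);

% loc gives each remaining pair's position in smallUserIDList (0 if absent)
% ASSUMPTION: smallUserIDList has no repeated IDs
[tf, loc] = ismember(pairs(:,1), smallUserIDList);

% Count the remaining pairs per user; users with no match stay at 0
userStat = accumarray(loc(tf), 1, [numel(smallUserIDList) 1]);
```

On the example input this gives the same column of counts as the loop in the question. Whether it is faster than the bsxfun version depends on the data sizes, so it is worth timing both on real input.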
As a bonus, you can replace the pre-allocation step of the bsxfun-based solution -
userStat = zeros(numel(smallUserIDList),1);
with this much faster pre-allocation scheme -
userStat(numel(smallUserIDList),1) = 0; % column vector; userStat must not exist yet
Read more about it in this MATLAB Undocumented post on Pre-allocation.
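Two caveats about pre-allocating by index assignment are easy to miss, so here is a small sketch of them. The subscript order decides the orientation ((n,1) gives a column, (1,n) a row), and the trick only zero-fills when the variable does not exist yet. The length n below is a hypothetical stand-in for numel(smallUserIDList).

```matlab
n = 4;                  % hypothetical length, stands in for numel(smallUserIDList)

clear userStat          % the trick relies on the variable not existing yet
userStat(n,1) = 0;      % creates a new n-by-1 array, implicitly zero-filled
sameAsZeros = isequal(userStat, zeros(n,1));

userStat(:) = 7;        % pretend the variable still holds old values
userStat(n,1) = 0;      % now only the last element is overwritten
staleValuesRemain = any(userStat ~= 0);
```

Here sameAsZeros and staleValuesRemain both come out true, so when the variable may already exist, zeros(n,1) is the safer choice.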
Upvotes: 1