Reputation: 459
I have a collection of arrays of the form,
a = [1 2 3 5 6 8 9 10]
b = [1 2 3 4 7 14]
c = [1 2 3 4 5 6 7 8 15 19 20]
That is, each one is a collection of non-repeating numbers from 1 to M (say M is 20 for the example, though it could be much larger in practice). In general, I will have many such arrays (many more than 3; I don't know beforehand exactly how many, but likely on the order of 5000-10000), with duplicates (that is, array b may show up multiple times).
Objective: I want to store the arrays in some object, call it X, that keeps track of how many instances of each array it includes. Furthermore, when faced with a new array, we should be able to search X and increment the count of that array in the object, or, if it is not yet in X, add it to the object (with a count of 1).
Question: What is an efficient way to achieve the objective in Matlab?
What I've tried so far:
I was thinking about converting the arrays to logical arrays, for example,
a = [1 1 1 0 1 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0];
then maybe converting the above to a binary number a_bin and using it to index a cell array X, so that X{a_bin} stores the number of times a has appeared. This seems to scale poorly, though, since a_bin can get very large for large M.
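To make the scaling concern concrete, here is a minimal sketch of the conversion described above. The names a_mask and exact are illustrative, and treating the first element as the most significant bit is an assumption, since the post does not fix a bit order.

```matlab
M = 20;                          % upper bound taken from the example
a = [1 2 3 5 6 8 9 10];

a_mask = false(1, M);            % logical row vector of length M
a_mask(a) = true;                % mark the numbers that are present in a

% Read the mask as a binary number (assumption: first element is the
% most significant bit).
a_bin = sum(double(a_mask) .* 2.^(M-1:-1:0));

% A double represents integers exactly only up to flintmax (2^53), so
% distinct masks can map to the same a_bin once M exceeds 53.
exact = M <= 53;
```

Even while a_bin is still exact, X{a_bin} would need a cell array with up to 2^M entries, which is where the approach stops being practical.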
Upvotes: 0
Views: 149
Reputation: 1390
Use a cell array to keep them all in one place, and for each new array just search it (that is, compare against each stored array) to see whether the new array is already in the main cell array. There is no need to complicate things. Tag each element of the cell array with the array of positions at which the same array already occurs in the main cell array, and when a new duplicate arrives, update all of them.
Anyway, have you estimated the memory use? 20 elements/array * 10k arrays * 8 B/element = 1.6 MB. This is small.
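A minimal sketch of the linear-search idea could look like the following. It is a simplified variant, not exactly the bookkeeping described above: it keeps one copy of each distinct array plus a parallel count vector instead of position lists. The names counts and incoming are illustrative, and it assumes each array is already sorted (as in the question's examples) so that isequal detects duplicates.

```matlab
X = {};                 % distinct arrays seen so far
counts = [];            % counts(k) is how many times X{k} has appeared

% Sample input (assumption: arrays arrive sorted, as in the question)
incoming = {[1 2 3 5 6 8 9 10], [1 2 3 4 7 14], [1 2 3 4 7 14]};

for i = 1:numel(incoming)
    new = incoming{i};
    % Linear search: index of the first stored array equal to new, if any
    idx = find(cellfun(@(x) isequal(x, new), X), 1);
    if isempty(idx)
        X{end+1} = new;                  % first occurrence: store it
        counts(end+1) = 1;
    else
        counts(idx) = counts(idx) + 1;   % seen before: bump its count
    end
end
```

Each lookup costs one comparison per stored array, which should be acceptable for the 5000-10000 arrays mentioned in the question; if the arrays may arrive unsorted, calling sort on each one before the search would be needed.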
Upvotes: 1