dsmalenb

Reputation: 149

Matlab: How to remove cell elements which have other sets as subsets

I have a cell with arrays listed inside:

C = {[1,2,3,4], [3,4], [2], [4,5,6], [4,5], [7]}

I want to output:

D = {[3,4], [2], [4,5], [7]}

The sets in D are the only sets in C that do not contain any other set from C as a subset.

Please see the linked question for a similar problem. Although the solution there is elegant, I have not (yet) been able to modify its code to fit my particular question.

I would appreciate any help with a solution.

Thank you!

Upvotes: 1

Views: 108

Answers (2)

rahnema1

Reputation: 15837

As in the linked post, you can form the matrix s that holds the number of common elements between every pair of sets. Set i is a subset of set j exactly when s(i,j) equals the number of elements in set i. The code would be:

C = {[1,2,3,4], [3,4], [2], [4,5,6], [4,5], [7]};
n = cellfun(@numel,C);      % number of elements in each set
v = repelem(1:numel(C),n);  % row indices of the binary matrix (one row per set)
[~,~,u] = unique([C{:}]);   % column indices of the binary matrix (one column per distinct value)
b = accumarray([v(:),u(:)],ones(size(v)),[],@max,[],true); % sparse binary membership matrix
s = b * b.';                % s(i,j) = number of elements shared by sets i and j
s(1:size(s,1)+1:end) = 0;   % set diagonal elements to 0 (self-similarity is not needed)
result = C(~any(n(:) == s)); % keep the sets that do not contain any other set

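However, this matrix can be very large, so it may be better to process it one row at a time in a loop to avoid memory problems:

idx = false(1,numel(C));
for k = 1:numel(C)
    % keep set k if no set j satisfies s(k,j) == n(j), i.e. no set j is a subset of set k
    idx(k) = ~any(n == full(s(k, :)));
end
result = C(idx);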

Or follow a vectorized approach:

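[r, c, v] = find(s);             % rows, columns and values of the nonzero overlaps
idx = sub2ind(size(s), r, c);    % linear indices of those entries
s(idx) = v.' == n(r);            % 1 where the overlap equals the size of set r (set r is a subset of set c)
result = C(~any(s));             % keep the sets (columns) that contain no other set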
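
To see why comparing s against n finds subsets, it can help to write out the intermediate values for the example. The following is only a sketch: it builds a dense membership matrix with ismember instead of the accumarray call above (a substitution made for readability), and vals, isSubset and hasSubset are illustrative names. It assumes that no set contains repeated values and relies on implicit expansion (R2016b or later).

```matlab
C = {[1,2,3,4], [3,4], [2], [4,5,6], [4,5], [7]};
n = cellfun(@numel, C);            % set sizes: [4 2 1 3 2 1]
vals = unique([C{:}]);             % distinct values: 1:7

% Dense membership matrix: b(i,k) is true when vals(k) occurs in C{i}
b = cell2mat(cellfun(@(x) ismember(vals, x), C(:), 'UniformOutput', false));

s = double(b) * double(b).';       % s(i,j) = number of values shared by sets i and j
s(logical(eye(numel(C)))) = 0;     % ignore each set's overlap with itself
% s =
%    0 2 1 1 1 0
%    2 0 0 1 1 0
%    1 0 0 0 0 0
%    1 1 0 0 2 0
%    1 1 0 2 0 0
%    0 0 0 0 0 0

isSubset  = (s == n(:));           % isSubset(i,j): all of set i lies inside set j
hasSubset = any(isSubset, 1);      % [1 0 0 1 0 0]: sets 1 and 4 contain another set
result    = C(~hasSubset);         % {[3,4], [2], [4,5], [7]}
```

Row 2 of s has the value 2 in column 1, which equals n(2), so [3,4] lies entirely inside [1,2,3,4] and column 1 is dropped. The same happens for column 4 because of [4,5].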

Upvotes: 3

If_You_Say_So

Reputation: 1283

You could do this by comparing each element with every other element: if any other element is a subset of the current element, remove the current (larger) element. Here is simple code that does what you are looking for:

C = {[1,2,3,4], [3,4], [2], [4,5,6], [4,5], [7]};
% Initialize D with a copy of C
D = C;

% Compare each element i with every other element j
for i = 1:numel(C)
    for j = 1:numel(C)
        % Check whether every value of C{j} also occurs in C{i}
        if i ~= j && all(ismember(C{j}, C{i}))
            % Make unwanted elements empty
            D{i} = [];
        end
    end
end

% Remove empty elements
D(cellfun(@isempty,D)) = [];
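
The pairwise subset test can also be written with ismember and cellfun, so that it does not depend on the order of the sets or on how the numbers look as text. This is a minimal sketch, assuming that no set is empty and no set has repeated values. C2, isSub, keep and D2 are illustrative names, and the input is chosen so that a subset appears before its superset and some values have two digits.

```matlab
C2 = {[3,4], [10,3,4], [1,3], [1,2,3], [12], [2]};

% isSub(a, b) is true when every value of a also occurs in b
isSub = @(a, b) all(ismember(a, b));

keep = true(1, numel(C2));
for i = 1:numel(C2)
    others = C2([1:i-1, i+1:end]);    % all sets except C2{i}
    % keep C2{i} only if none of the other sets is contained in it
    keep(i) = ~any(cellfun(@(x) isSub(x, C2{i}), others));
end
D2 = C2(keep);                        % {[3,4], [1,3], [12], [2]}
```

Here [10,3,4] is removed because it contains [3,4], and [1,2,3] is removed because it contains [1,3], while [12] is kept because the value 2 is not one of its elements.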

Upvotes: 1
