Grigor
Grigor

Reputation: 11

Find cardinality of set

I have faced the following problem recently:

We have a sequence A of M consecutive integers, beginning at A[1] = 1: 1,2,...M (example: M = 8 , A = 1,2,3,4,5,6,7,8 )

We have the set T consisting of all possible subsequences made from L_T consecutive terms of A. (example L_T = 3 , subsequences are {1,2,3},{2,3,4},{3,4,5},...). Let's call the elements of T "tiles".

We have the set S consisting of all possible subsequences of A that have length L_S. ( example L_S = 4, subsequences like {1,2,3,4} , {1,3,7,8} ,...{4,5,7,8} ).

We say that an element s of S can be "covered" by K "tiles" of T if there exist K tiles in T such that the union of their sets of terms contains the terms of s as a subset. For example, subsequence {1,2,3} is possible to cover with 2 tiles of length 2 ({1,2} and {3,4}), while subsequnce {1,3,5} is not possible to "cover" with 2 "tiles" of length 2, but is possible to cover with 2 "tiles" of length 3 ({1,2,3} and {4,5,6}).

Let C be the subset of elements of S that can be covered by K tiles of T.

Find the cardinality of C given M, L_T, L_S, K.

Any ideas would be appreciated how to tackle this problem.

Upvotes: 1

Views: 574

Answers (1)

Gassa
Gassa

Reputation: 8846

Assume M is divisible by T, so that we have an integer number of tiles covering all elements of the initial set (otherwise the statement is currently unclear).

First, let us count F (P): it will be almost the number of subsequences of length L_S which can be covered by no more than P tiles, but not exactly that. Formally, F (P) = choose (M/T, P) * choose (P*T, L_S). We start by choosing exactly P covering tiles: the number of ways is choose (M/T, P). When the tiles are fixed, we have exactly P * T distinct elements available, and there are choose (P*T, L_S) ways to choose a subsequence.

Well, this approach has a flaw. Note that, when we chose a tile but did not use its elements at all, we in fact counted some subsequences more than once. For example, if we fixed three tiles numbered 2, 6 and 7, but used only 2 and 7, we counted the same subsequences again and again when we fixed three tiles numbered 2, 7 and whatever.

The problem described above can be countered by a variation of the inclusion-exclusion principle. Indeed, for a subsequence which uses only Q tiles out of P selected tiles, it is counted choose (M-Q, P-Q) times instead of only once: Q of P choices are fixed, but the other ones are arbitrary.

Define G (P) as the number of subsequences of length L_S which can be covered by exactly P tiles. Then, F (P) is sum for Q from 0 to P of the products G (Q) * choose (M-Q, P-Q). Working from P = 0 upwards, we can calculate all the values of G by calculating the values of F. For example, we get G (2) from knowing F (2), G (0) and G (1), and also the equation connecting F (2) with G (0), G (1) and G (2).

After that, the answer is simply sum for P from 0 to K of the values G (P).

Upvotes: 1

Related Questions