Sochima Biereagu
Sochima Biereagu

Reputation: 556

Rank of Partition with specific length

How can I determine the rank/index of a partition of integer n with length k?

For instance, if n=10 and k=3, the possible partitions (sorted in reverse lexicographic order) are:

0 (8, 1, 1)
1 (7, 2, 1)
2 (6, 3, 1)
3 (6, 2, 2)
4 (5, 4, 1)
5 (5, 3, 2)
6 (4, 4, 2)
7 (4, 3, 3)

and I want to know the index of a specific partition, such as [5, 3, 2].

What is an efficient method to obtain this index without generating all the partitions?

I've tried using lehmer codes to no avail, I've also tried writing a helper function num_partitions(n, k, p) which returns the number of partitions of n with k parts and the largest part not exceeding p

def num_partitions(n, k, p):
  if n < 0 or k < 0 or p <= 0:
    return 0
  if n == 0 and k == 0:
    return 1
  return (partition_count_p(n - p, k - 1, p)
          + partition_count_p(n, k, p - 1))

But i just can't seem to fully wrap my head around it, perhaps a literature i am not aware of 🤔

Upvotes: 4

Views: 188

Answers (3)

Sochima Biereagu
Sochima Biereagu

Reputation: 556

I eventually figured this out, figured i should post as a separate answer so it may help someone else that comes across this.

So it's inspired by this: Find the lexicographic order of an integer partition, but instead of using p(n, k) which returns count of partitions with at most k parts, we use the variation that only returns the count of partitions with length k:

def p(n, k):
  '''Return total number of partition for integer n having length k'''
  if n == 0 and k == 0:
    return 1
  if k == 1 or n == k:
    return 1
  if k > n:
    return 0
  return p(n-1, k-1) + p(n-k, k)

def num_partitions(n, k, p):
  '''Returns the number of partitions of n with k parts and the largest part not exceeding p'''
  if n < 0 or k < 0 or p <= 0:
    return 0
  if n == 0 and k == 0:
    return 1
  return (num_partitions(n - p, k - 1, p)
          + num_partitions(n, k, p - 1))

Then to compute the rank, we simply do (inspired by [1]):

def partition_rank(arr):
  n = _n = sum(arr)
  k = _k = len(arr)
  r = 0
  for v in arr:
    r += num_partitions(n, k, v-1)
    n -= v
    k -= 1
  return p(_n, _k) - r - 1

Test:

arr = [(8, 1, 1), (7, 2, 1), (6, 3, 1), (6, 2, 2), (5, 4, 1), (5, 3, 2), (4, 4, 2), (4, 3, 3)]
for partition in arr:
  print(partition, partition_rank(partition))

Output:

(8, 1, 1) 0
(7, 2, 1) 1
(6, 3, 1) 2
(6, 2, 2) 3
(5, 4, 1) 4
(5, 3, 2) 5
(4, 4, 2) 6
(4, 3, 3) 7

You could easily employ dynamic programming to make the computation more efficient.

Upvotes: 1

Joseph Wood
Joseph Wood

Reputation: 7608

In the RcppAlgos* package for R, there are many ranking functions including for ranking partitions with the caveat that ranking/unranking is done lexicographically.

partitionsRank from RcppAlgos

It appears that the OP's original source vector is v = 1:8, we are allowed to have repetition, and the target is 10. To find the corresponding rank for c(5, 3, 2), we have the following (N.B. R starts indexing at 1, sometimes called 1-based):

library(RcppAlgos)
partitionsRank(c(5, 3, 2), v = 1:8, repetition = TRUE, target = 10)
#> [1] 6

Flexibility

These functions are flexible and very efficient (They are written in C++ under the hood).

Let's look at a different example. Restricted partitions of 100 of length 10 with distinct parts (N.B. repetition = FALSE is the default).

First let's generate and explore:

## Results are immediate
system.time(parts100 <- partitionsGeneral(100, 10))
#>    user  system elapsed 
#>   0.001   0.000   0.000

## The first 6
head(parts100)
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,]    1    2    3    4    5    6    7    8    9    55
#> [2,]    1    2    3    4    5    6    7    8   10    54
#> [3,]    1    2    3    4    5    6    7    8   11    53
#> [4,]    1    2    3    4    5    6    7    8   12    52
#> [5,]    1    2    3    4    5    6    7    8   13    51
#> [6,]    1    2    3    4    5    6    7    8   14    50

## the last 6
tail(parts100)
#>          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [33396,]    5    6    7    8    9   10   11   12   14    18
#> [33397,]    5    6    7    8    9   10   11   12   15    17
#> [33398,]    5    6    7    8    9   10   11   13   14    17
#> [33399,]    5    6    7    8    9   10   11   13   15    16
#> [33400,]    5    6    7    8    9   10   12   13   14    16
#> [33401,]    5    6    7    8    9   11   12   13   14    15

Now, let's filter for the 1st, 10th, 100th, 1000th, and 10000th partitions:

test <- parts100[c(1, 10, 100, 1000, 10000), ]

## Print them
test
#>      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> [1,]    1    2    3    4    5    6    7    8    9    55
#> [2,]    1    2    3    4    5    6    7    8   18    46
#> [3,]    1    2    3    4    5    6    7   12   28    32
#> [4,]    1    2    3    4    5    7    8   18   21    31
#> [5,]    1    2    3    6    8   11   12   13   21    23

And finally we rank them (N.B. when the target is not explicitly given, we use the max(v) = 100 in this case):

partitionsRank(test, v = 1:100)
#> [1]     1    10   100  1000 10000

Large Cases

RcppAlgos utilizes the gmp library for handling cases when the number of partitions is large. Observe:

partitionsCount(10000, 10, TRUE)
Big Integer ('bigz') :
#> [1] 771441672987331681548480

## Generate a reproducible sample for testing. Note we use
## namedSample = TRUE to return the lexicographic index
bigSamp <- partitionsSample(10000, 10, TRUE, n = 3,
                            seed = 42, namedSample = TRUE)
bigSamp
#>                          [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
#> 356581031796290627391690   67  219  435  487  814  847  876 1187 1841  3227
#> 346735250799296299713370   65  106  391  544  599  622  926 1097 1949  3701
#> 273267181434457516383461   48  139  145  510  615  628  958 1144 1309  4504

## Confirm we have integer partitions of 10000
rowSums(bigSamp)
#> 356581031796290627391690 346735250799296299713370 273267181434457516383461 
#>                    10000                    10000                    10000

## Now we rank:
partitionsRank(bigSamp, v = 10000, repetition = TRUE)
#> Big Integer ('bigz') object of length 3:
#> [1] 356581031796290627391690 346735250799296299713370 273267181434457516383461

* I am the author of RcppAlgos

Upvotes: 0

Matt Timmermans
Matt Timmermans

Reputation: 59303

The index of a partition is the number of reverse-lexically smaller partitions. You can divide this number into two parts: all partitions with a smaller last number, and smaller partitions with the same last number.

def partition_index(list):
   k = len(list)
   n = sum(list)

   # every number is guaranteed > this
   # we are partitioning the excess
   toosmall = 0

   index = 0

   while (k>1):
       last = list[k-1] - toosmall

       # count partitions of n into k parts with smaller last
       if last > 1:
           # you already wrote this function
           index += num_partitions(n,k,last-1)

       # loop to count smaller partitions with the same last number
       k -= 1
       n -= last

       # since we only accept partitions in decreasing order...
       toosmall += (last-1)
       n -= k*(last-1)

    return index

Upvotes: 2

Related Questions