Kevin
Kevin

Reputation: 13

generate a special matrix (max value of column sum is minimum) with given number of column from a vector

Recently I come across such as a question: given a vector, one need generate a special matrix with given number of column. It should be pointed out that if the elements in the vector is not enough to fill in the generated matrix, then put 0 in the last row in the generated matrix. For the generated matrix, it has a requirement that need the maximum value of column sum is the minimum.

The following is the code for the given question:

x <- c(10, 10, 9, 21, 8, 3, 7, 23, 1, 5, 26)
x
ncol <- 4

x <- sort(x, decreasing = TRUE)
x

nx <- length(x)
nrow <- ceiling(nx / ncol)

# add 0 in the end
if (nx < nrow * ncol) {
  x <- c(x, rep(0, length.out = nrow * ncol - nx))
}
x

Here is the correct matrix, what I want!

# generate mat_a
mat_a <- rbind(c(26, 23, 21, 10),
               c(5, 7, 8, 10),
               c(0, 1, 3, 9))
mat_a
# max value of column sum
max(colSums(mat_a)) # 32

The matrix below is what I got, but it is wrong!

# generate mat_b
mat_b <- rbind(c(26, 23, 21, 10),
               c(7, 8, 9, 10),
               c(0, 1, 3, 5))
mat_b
max(colSums(mat_b)) # 33

Since max(colSums(mat_a)) < max(colSums(mat_b)), mat_a is the wanted matrix.

From the above code, we know that mat_a is the wanted matrix, since max(colSums(mat_a)) < max(colSums(mat_b)). I understand that there are many matrices can be generated from a vector, but the matrix with the above requirement is only one (or a little if same values are generated). It seems a combinatorial algorithms or dynamic programming problem, but unfortunately, I cannot figure out how to get the solution for the above question. I appreciate it you can provide some hints about the problem or give the effective solutions.

Upvotes: 1

Views: 402

Answers (1)

chinsoon12
chinsoon12

Reputation: 25225

Here is a possible greedy heuristic approach and probably works only where all values in x are non-negative.

Starting with the largest values first. Initialize the first row with the largest values. Then, add each subsequent non-zero value to the column with the smallest sum.

x <- c(10, 10, 9, 21, 8, 3, 7, 23, 1, 5, 26)
x <- sort(x, decreasing=TRUE)
nc <- 4L
nr <- ceiling(length(x) / 4)

#Initialize the first row with the largest values
y <- c(x[seq_len(nc)], rep(0, nc*nr-4L))
#use list to store each row of the final matrix
yl <- split(y, (seq_along(y)-1L) %% nc)

#for subsequent values
for (k in (nc+1L):length(x)) {
    #find the smallest sum among the rows provided the rows are not totally filled
    idx <- names(which.min(lapply(yl[sapply(yl, function(x) any(x==0))], sum)))

    #store this value the appropriate row
    yl[[idx]][min(which(yl[[idx]]==0L))] <- x[k]
}

#output desired matrix
matrix(unlist(yl), ncol=nc)

output:

     [,1] [,2] [,3] [,4]
[1,]   26   23   21   10
[2,]    5    7    8   10
[3,]    0    1    3    9

Upvotes: 1

Related Questions