jmparejaz
jmparejaz

Reputation: 162

Generate conditioned random binary array matrix in python

what could be the most efficient way to generate a mxn binary array matrix which is contrained to the sum per column is equal 0 or 1 ?

somethin like this

[[0,0,1,0,0],
[1,1,0,0,0],
[0,0,0,0,1]
[0,0,0,0,0]]

m and n are going to be fixed, but n is larger than 500000 so iteration methods could take a long time until an appropied matrix is found.

Upvotes: 2

Views: 1665

Answers (4)

Warren Weckesser
Warren Weckesser

Reputation: 114781

You are selecting a random subset of the columns, and then for each column, a random row. Here's one way to do that using numpy. The binomial distribution is used to select which columns get a 1. Change the second argument of numpy.random.binomial to tweak the density of columns with a 1.

In [156]: m = 5

In [157]: n = 12

In [158]: a = np.zeros((m, n), dtype=int)

In [159]: cols = np.random.binomial(1, 0.7, size=n)

In [160]: a[np.random.randint(0, m, size=cols.sum()), np.nonzero(cols)[0]] = 1

In [161]: a
Out[161]: 
array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0],
       [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0]])

If you want a 1 in each column, here's a fairly concise method:

In [103]: m = 5

In [104]: n = 12

In [105]: a = (np.random.randint(0, m, size=n) == np.arange(m).reshape(-1, 1)).astype(int)

In [106]: a
Out[106]: 
array([[0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1],
       [0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 0],
       [0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0],
       [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])

np.random.randint(0, m, size=n) is the random selection of row indices, one for each column. np.arange(m).reshape(-1, 1) is the sequence [0, 1, ..., m-1] stored in an array with shape (m, 1). When this is compared to the random values, broadcasting applies, so an boolean array with shape (m, n) is created. Just convert that to integers, and you have your result.

Upvotes: 5

Brobin
Brobin

Reputation: 3326

You can use the standard library and a few clever list comprehensions to do this. First, we create the columns, with one in each being 1 or 0 so we satisfy the sum constraint. Then we flip the columns into rows to get the result.

from random import choice, randint

def generate_matrix(m, n):
    # Generate the columns
    columns = []
    for x in range(n):
        column = [0 for y in range(m)]
        column[randint(0, m - 1)] = choice([0, 1])
        columns.append(column)
    # Rotate the columns into rows
    rows = [
        [c[x] for c in columns]
        for x in range(m)
    ]
    return rows

m, n = 5, 4
matrix = generate_matrix(m, n)

Example output:

[0, 1, 0, 0]
[0, 0, 0, 0]
[0, 0, 0, 0]
[0, 0, 1, 0]
[0, 0, 0, 0]

Upvotes: 1

Jean-François Fabre
Jean-François Fabre

Reputation: 140168

  • First, generate a zero-filled matrix using a list comprehension and list multiplication (even faster) to generate each row. This is the step where you have to be fast. Transposing/copying data would hinder performance.
  • Then loop and choose a number. Between 0 and number of rows * 2 (so it sometimes goes off limits, leaving a 0-filled column). If it's in range, put a 1 there (you could change by (3*nb_rows)//2 to increase the number of ones. This step is fast: memory is already allocated, just mark cells.

code:

import random

m=7
n=5

matrix = [[0] * m for _ in range(n)]
print(matrix)
for i in range(m):
    a = random.randint(0,(3*n)//2)  # 66% chances to get a 1 somewhere
    if a < n:
        matrix[a][i] = 1

print(matrix)

a result:

[[0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0]]
[[0, 0, 0, 1, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 1, 0]]

Upvotes: 1

Benjamin
Benjamin

Reputation: 11860

You can select which column gets a value of 1:

a = numpy.zeros((ysize, xsize))
a[numpy.arange(ysize), numpy.random.choice(numpy.arange(xsize), ysize, replace=False)] = 1

Upvotes: 1

Related Questions