Marion
Marion

Reputation: 319

Generate a matrix of transition probabilities for bit strings of given size following some probability distribution

I want to create an 8x8 matrix which provides the error probabilities in bit communication. The matrix looks as follows:enter image description here

The columns amount to the observed quantities and the rows to the measured quantities. An element p[i,j] amounts to the conditional probability p(j|i). For example, the element p[0,1] gives the probability to observe the string 001 when the actual value is 000, i.e., it measures p(001|000).

Question: How can I create such a matrix in Python such that

  1. The more bit flips there are the smaller is the equivalent conditional probability (for example p(100|000)<p(110|000)?
  2. How to enable an "asymmetry". I.e., the probability of p(001|000)< p(000|001). That is, having bias that favors with higher probabilities transitions 1 to 0 than transitions 0 to 1.

Of course, the sum of probabilities in each row must equal to 1.

All in all, I want to create function in Python that takes as input an integer n (the size of the matrix, or equivalently where 2^n is the length of the bit string) and outputs a probability transition matrix with the above specified rules.

The difficulty is how to implement a probability distribution to fill the cells.

It is trivial to create an 8x8 array and fill diagonals:

P = np.zeros((8,8))
for i in range(8):
    for j in range(8):
        if i==j:
            P[i,j]=1

Similarly, its trivial to fill a given row or a given column by a fixed number. However, I cannot figure out (how to even begin) to fill such a matrix following the rules above, or even how exactly to define the distribution the elements must follow.

Upvotes: 2

Views: 720

Answers (3)

norok2
norok2

Reputation: 26956

Value- and position-independent bit transition

The probability of certain bit state transitioning to another bit state can be computed in a number of scenarios.

One of the simplest is when there is a given probability p of a certain bit transitioning to a different state, which is independent of both the bit value, the position of the bit within the bit state and the other bits transitioning.

Of course, the probability of the bit not flipping is given by q == 1 - p.

(The statistics of n independent events with two outcomes has been studied extensively.)

For more bits, the probability of multiple bits transitions can be combined via multiplication.

The probability of transition from a to b (where a and b are two bits configurations of the same length n) depend on the number of bit transitions t_ab and non-transitions s_ab == n - t_ab:

p(a, b) == (p ** t_ab) * (q ** s_ab)

For example, the transition: 0b00011 to 0b00101 is given by:

p(0b00011, 0b00101) == (q ** 3) * (p ** 2)

Note that this is different from e.g. the 0b011 to 0b101 transition probability, since the number of bits to consider plays a role.

Given a function for counting the number of 1s in a number:

def count_set_bits(num):
    result = 0
    while num:
        result += num & 1
        num >>= 1
    return result

a simple way of computing t is via the xor operator:

t = count_set_bits(a ^ b)

Thus, it is possible to compute the transition probability matrix w_bits "manually" with simple looping.

This is extremely slow to compute, unless one accelerates the explicit looping. One of the simplest acceleration for this use case is with Numba. All _nb-ending functions are accelerated with it. The fastmath flag nb.njit(fastmath=True) could be set to potentially reducing execution time by a few percent.

import numpy as np
import numba as nb


@nb.njit
def count_set_bits(num):
    result = 0
    while num:
        result += num & 1
        num >>= 1
    return result


@nb.njit
def w_bits_sym_cb_nb(n, p=0.2):
    if n > 0:
        q = 1 - p
        m = 2 ** n
        result = np.empty((m, m), dtype=np.float_)
        for i in range(m):
            for j in range(i + 1):
                t = count_set_bits_nb(i ^ j)
                s = n - t
                result[i, j] = result[j, i] = (p ** t) * (q ** s)
        return result
    else:
        return np.empty((0, 0))

(Note that also count_set_bits() has been accelerated).

Alternatively, one could get away with element-wise multiplying probability matrices constructed by repeating the base probability matrix for the 1 bit case:

  0 1
0 q p
1 p q

with power of two repetitions, e.g. for two bytes:

q p q p     q q p p
p q p q  X  q q p p
q p q p     p p q q  
p q p q     p p q q

This can again be computed with "manual" looping:

@nb.njit
def w_bits_sym_lm_nb(n, p=0.2):
    if n > 0:
        b = 2
        m = b ** n
        q = 1 - p
        base = np.array([[q, p], [p, q]])
        result = np.ones((m, m), dtype=base.dtype)
        for k in range(n):
            bk = (b ** k)
            for i in range(m):
                for j in range(m):
                    result[i, j] *= base[i // bk % b, j // bk % b]
        return result
    else:
        return np.empty((0, 0))

However, a much faster approach consist in performing the element-wise matrix multiplications with repeated elements (a polished version of @PierreD's answer) using broadcasted multiplication:

import numpy as np


def bc_mul(a, b):
    nm = len(a) * len(b)
    return (a[:, None, :, None] * b[None, :, None, :]).reshape(nm, nm)


def w_bits_sym_bm(n, p=0.2):
    if n > 0:
        base = np.array([[1 - p, p], [p, 1 - p]])
        result = base.copy()
        for i in range(1, n):
            result = bc_mul(base, result)
        return result
    else:
        return np.empty((0, 0))

Note that because bc_mul() is associative, one can write the line inside the loop either as result = bc_mul(base, result) or as result = bc_mul(result, base), but with very different performances!

This last approach is also quite fast, especially asymptotically for larger n, essentially because it performs a exponentially less multiplications.

The same can also be rewritten with Numba with similar (but marginally slower performances):

@nb.njit
def bc_mul_nb(a, b):
    n = len(a)
    m = len(b)
    nm = n * m
    result = np.empty((nm, nm), dtype=a.dtype)
    for i in range(n):
        for j in range(m):
            for k in range(n):
                for l in range(m):
                    result[i * m + j, k * m + l] = a[i, k] * b[j, l]
    return result


@nb.njit
def w_bits_sym_bm_nb(n, p=0.2):
    if n > 0:
        base = np.array([[1 - p, p], [p, 1 - p]])
        result = base.copy()
        for i in range(1, n):
            result = bc_mul_nb(base, result)
        return result
    else:
        return np.empty((0, 0))

More on execution speed (including benchmarks) below.


Value-dependent / position-independent bit transition

A slightly more complex, and more interesting scenario scenario is when the 0 to 1 and the 1 to 0 probabilities are different, but still independent on the position, etc.

Both can be computed from the base probability matrix:

    0   1
0 p00 p01
1 p10 p11

where p00, p01, p10 and p11 are the probabilities of one bit transitioning from one state to another.

Of course:

  • p00 == 1 - p01
  • p11 == 1 - p10

As before, for more bits, the probability of multiple bits transitions can be combined via multiplication.

This is essentially an asymmetric version of the above.

The probability of transition from a to b (where a and b are two bits configurations of the same length) depend on the number of transitions t00_ab, t01_ab, t10_ab, t11_ab multiplied by their respective probabilities (with the notation used for the symmetric case, t01 and t10 correspond to t, t00 and t11 correspond to s):

p(a, b) == (
    (p00 ** t00_ab) *
    (p01 ** t01_ab) *
    (p10 ** t10_ab) *
    (p11 ** t11_ab))

For example, the transition: 0b00011 to 0b00101 is given by:

p(0b00011, 0b00101) == (p00 ** 2) * (p01 ** 1) * (p10 ** 1) * (p11 ** 1)

Of course, all this can be computed similarly to the above. The set bit counting method can be either used directly on ~a & b and a & ~b alongside a & b to count bit transitions:

@nb.njit
def w_bits_cb_nb(n, p01=0.2, p10=-1):
    if n > 0:
        p10 = p10 if p10 >= 0 else p01
        p00 = 1 - p01
        p11 = 1 - p10
        m = 2 ** n
        result = np.empty((m, m), dtype=np.float_)
        for i in range(m):
            for j in range(m):
                t11 = count_set_bits_nb(i & j)
                t01 = count_set_bits_nb(~i & j)
                t10 = count_set_bits_nb(i & ~j)
                t00 = n - (t11 + t01 + t10)
                result[i, j] = \
                    (p00 ** t00) * (p11 ** t11) * (p01 ** t01) * (p10 ** t10)
        return result
    else:
        return np.empty((0, 0))

or could be done slightly more efficiently in a single loop (similar but faster than what is present in @Viglione's current answer):

@nb.njit
def bit_diff_nb(a, b, n):
    t11 = t01 = t10 = 0
    t00 = n
    while a | b:
        aa = a & 1
        bb = b & 1
        t11 += aa & bb
        t01 += ~aa & bb
        t10 += aa & ~bb
        a >>= 1
        b >>= 1
    t00 = n - (t11 + t01 + t10)
    return t00, t11, t01, t10


@nb.njit
def w_bits_bd_nb(n, p01=0.2, p10=-1):
    if n > 0:
        p10 = p10 if p10 >= 0 else p01
        p00 = 1 - p01
        p11 = 1 - p10
        m = 2 ** n
        result = np.empty((m, m), dtype=np.float_)
        for i in range(m):
            for j in range(m):
                t00, t11, t01, t10 = bit_diff_nb(i, j, n)
                result[i, j] = \
                    (p00 ** t00) * (p11 ** t11) * (p01 ** t01) * (p10 ** t10)
        return result
    else:
        return np.empty((0, 0))

Alternatively, all the other methods can be easily extended to this case:

@nb.njit
def w_bits_lm_nb(n, p01=0.2, p10=-1):
    if n > 0:
        p10 = p10 if p10 >= 0 else p01
        b = 2
        m = b ** n
        base = np.array([[1 - p01, p01], [p10, 1 - p10]])
        result = np.ones((m, m), dtype=base.dtype)
        for k in range(n):
            bk = (b ** k)
            for i in range(m):
                for j in range(m):
                    result[i, j] *= base[i // bk % b, j // bk % b]
        return result
    else:
        return np.empty((0, 0))
def w_bits_bm(n, p01=0.1, p10=-1):
    if n > 0:
        p10 = p10 if p10 >= 0.0 else p01
        base = np.array([[1 - p01, p01], [p10, 1 - p10]])
        result = base.copy()
        for i in range(1, n):
            result = bc_mul(base, result)
        return result
    else:
        return np.empty((0, 0))
def w_bits_bmi(n, p01=0.1, p10=-1):
    if n > 0:
        p10 = p10 if p10 >= 0.0 else p01
        base = np.array([[1 - p01, p01], [p10, 1 - p10]])
        result = base.copy()
        for i in range(1, n):
            result = bc_mul(result, base)
        return result
    else:
        return np.empty((0, 0))

Result Consistency

For completeness, I also include the currently accepted and top voted answer approach (which is similar to w_bits_bd_nb() but using binary strings and without acceleration) with some bridging code to get the underlying NumPy array:

import pandas as pd


def calc_p(sent, observed, p01, p10):
    p00 = 1 - p01
    p11 = 1 - p10
    r = 1
    for i, _ in enumerate(sent):
        if sent[i] == "0":
            r *= p00 if observed[i] == "0" else p01
        else:
            r *= p10 if observed[i] == "0" else p11
    return r


def generate_error_matrix(n, p01, p10):
    labels = [f"{i:0{n}b}" for i in range(0, 2 ** n)]
    result = pd.DataFrame(index=labels, columns=labels)
    for rowIndex, row in result.iterrows():
        for columnIndex, _ in row.items():
            result.at[rowIndex, columnIndex] = calc_p(rowIndex, columnIndex, p01, p10)
    return result

    
def w_bits_bs_pd(n, p01=0.2, p10=-1):
    p10 = p10 if p10 >= 0.0 else p01
    return generate_error_matrix(n, p01, p10).to_numpy().astype(float)
funcs = (
    w_bits_bm, w_bits_bmi,
    w_bits_cb_nb, w_bits_bd_nb, w_bits_lm_nb,
    w_bits_bm_nb, w_bits_bmi_nb,
    w_bits_sym_cb_nb, w_bits_sym_bm_nb, w_bits_sym_lm_nb,
    w_bits_bs_pd)


n = 2
base = funcs[0](n)
print(f"{'ProbRowsSumTo1:':>27} {np.allclose(np.sum(base, 0), np.ones(2 ** n))}")

x = w_bits_bm(10, 0.2, 0.2)
print(f"{'(p01 == p10) ->  Symmetric:':>27} {np.allclose(x, x.T)}")
x = w_bits_bm(10, 0.2, 0.4)
print(f"{'(p01 != p10) -> Asymmetric:':>27} {not np.allclose(x, x.T)}")
print()
for func in funcs:
    res = func(n)
    print(f"{func.__name__!s:>20}  Same: {np.allclose(base, res)}")
    print(func(2))
    print()
            ProbRowsSumTo1: True
(p01 == p10) ->  Symmetric: True
(p01 != p10) -> Asymmetric: True

           w_bits_bm  Same: True
[[0.64 0.16 0.16 0.04]
 [0.16 0.64 0.04 0.16]
 [0.16 0.04 0.64 0.16]
 [0.04 0.16 0.16 0.64]]

          w_bits_bmi  Same: True
[[0.64 0.16 0.16 0.04]
 [0.16 0.64 0.04 0.16]
 [0.16 0.04 0.64 0.16]
 [0.04 0.16 0.16 0.64]]
...

The code below shows that:

  • all functions do give the same result
  • if p01 == p10 the transition matrix is symmetric
  • if p01 != p10 the transition matrix is asymmetric
  • all rows add up to one (individually)

Benchmarks

Since most of the symmetric implementations are very similar to the asymmetric ones, they have been omitted from the benchmarks.

funcs = (
    w_bits_bm, w_bits_bmi,
    w_bits_cb_nb, w_bits_bd_nb, w_bits_lm_nb,
    w_bits_bm_nb, w_bits_bmi_nb,
    w_bits_sym_cb_nb, w_bits_bs_pd)

timings = {}
for n in range(1, 12):
    print(f"n = {n}")
    timings[n] = []
    base = funcs[0](n)
    for func in funcs:
        res = func(n)
        timed = %timeit -r 4 -n 8 -q -o func(n)
        timing = timed.best * 1e6
        timings[n].append(timing)
        print(f"{func.__name__:>24}  {np.allclose(base, res)}  {timing:10.3f} µs")

to be plotted with:

import pandas as pd


df = pd.DataFrame(data=timings, index=[func.__name__ for func in funcs]).transpose()
df.plot(marker='o', logy=True, xlabel='Num. bits n / #', ylabel='Best timing / µs')

producing:

benchmarks

Which do indicate that broadcasted multiplication based solutions to be the asymptotically for larger n the most performant, but overall quite performant at all scales.

Note that since the computational complexity grown exponentially, the timings have been plotted with y-logarithmic scale.

Note also that w_bits_bs_pd() is some orders of magnitudes slower than the others.


Nicer Output

As usual, when dealing with well-known objects like tables / matrices, it is beneficial to work with tools specific to that.

If one wants to have nice looking output, one could use Pandas (similarly to what is done in @Viglione's answer) and Seaborn for nicer visualization:

import pandas as pd
import seaborn as sns


def gen_bit_transitions(n, p01=0.2, p10=-1, func=w_bits_bm):
    data = func(n, p01, p10)
    labels = [f"{i:0{n}b}" for i in range(2**n)]
    return pd.DataFrame(data, index=labels, columns=labels)
df = gen_bit_transitions(3, 0.4, 0.2)
sns.set(rc={'figure.figsize': (8, 7)})
sns.heatmap(df, annot=True, vmin=0.0, vmax=1.0)

plot_n3

df = gen_bit_transitions(5, 0.4, 0.2)
sns.set(rc={'figure.figsize': (9, 8)})
sns.heatmap(df, annot=False, vmin=0.0, vmax=1.0)

plot_n5

Upvotes: 1

Salvatore
Salvatore

Reputation: 12102

It turns out you can do this simply without numpy or scipy. I use pandas for nice printing.

The logic is that for each bit, you have a probability of flipping (p01 or p10) or remaining the same (p00 or p11). Transforming one bit string to another requires multiplying the appropriate probability for each of n bits.

For example: P(010|001) = P(0->0) * P(1->0) * P(0->1) = p00 * p10 * p01

This process is repeated for every sent and observed combination.

You can further reduce the two level if statement below to one line using nested ternary assignment, but I think this is a nice balance of being concise and readable:

import pandas as pd


def p(sent, observed, p01, p10):
    """Return the probability of 'sent' being received as 'observed'
    given p01 (the probability a bit flips from a 0->1) and p10 (the
    probability a bit flips from 1->0).
    """
    p00 = 1 - p01
    p11 = 1 - p10
    r = 1
    for i, _ in enumerate(sent):
        if sent[i] == "0":
            r *= p00 if observed[i] == "0" else p01
        else:
            r *= p10 if observed[i] == "0" else p11
    return r

def generate_error_matrix(n, p01, p10):
    """Print a matrix of the transitions of all permutations of bit
    errors for a given bit length.

    Parameters:
        n - the number of bits
        p01 - probability of a bit flipping from 0 to 1
        p10 - probability of a bit flipping from 1 to 0
    """
    labels = [f"{i:0{n}b}" for i in range(0, 2**n)]
    result = pd.DataFrame(index=labels, columns=labels)
    for rowIndex, row in result.iterrows():
        for columnIndex, _ in row.items():
            result.at[rowIndex, columnIndex] = p(rowIndex, columnIndex, p01, p10)
    return result

Here's an example:

print(generate_error_matrix(n=3, p01=0.2, p10=0.1))
       000    001    010    011    100    101    110    111
000  0.512  0.128  0.128  0.032  0.128  0.032  0.032  0.008
001  0.064  0.576  0.016  0.144  0.016  0.144  0.004  0.036
010  0.064  0.016  0.576  0.144  0.016  0.004  0.144  0.036
011  0.008  0.072  0.072  0.648  0.002  0.018  0.018  0.162
100  0.064  0.016  0.016  0.004  0.576  0.144  0.144  0.036
101  0.008  0.072  0.002  0.018  0.072  0.648  0.018  0.162
110  0.008  0.002  0.072  0.018  0.072  0.018  0.648  0.162
111  0.001  0.009  0.009  0.081  0.009  0.081  0.081  0.729

And some edge cases:

Zeroes always flip to ones, ones never flip to zeroes:

print(generate_error_matrix(n=3, p01=1, p10=0))
    000 001 010 011 100 101 110 111
000   0   0   0   0   0   0   0   1
001   0   0   0   0   0   0   0   1
010   0   0   0   0   0   0   0   1
011   0   0   0   0   0   0   0   1
100   0   0   0   0   0   0   0   1
101   0   0   0   0   0   0   0   1
110   0   0   0   0   0   0   0   1
111   0   0   0   0   0   0   0   1

Ones always flip to zeroes, zeroes never flip to ones:

print(generate_error_matrix(n=3, p01=0, p10=1))
    000 001 010 011 100 101 110 111
000   1   0   0   0   0   0   0   0
001   1   0   0   0   0   0   0   0
010   1   0   0   0   0   0   0   0
011   1   0   0   0   0   0   0   0
100   1   0   0   0   0   0   0   0
101   1   0   0   0   0   0   0   0
110   1   0   0   0   0   0   0   0
111   1   0   0   0   0   0   0   0

Bits always flip:

print(generate_error_matrix(n=3, p01=1, p10=1))
    000 001 010 011 100 101 110 111
000   0   0   0   0   0   0   0   1
001   0   0   0   0   0   0   1   0
010   0   0   0   0   0   1   0   0
011   0   0   0   0   1   0   0   0
100   0   0   0   1   0   0   0   0
101   0   0   1   0   0   0   0   0
110   0   1   0   0   0   0   0   0
111   1   0   0   0   0   0   0   0

Every bit has a 50% chance of flipping, regardless of direction:

print(generate_error_matrix(n=3, p01=0.5, p10=0.5))
       000    001    010    011    100    101    110    111
000  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
001  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
010  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
011  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
100  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
101  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
110  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125
111  0.125  0.125  0.125  0.125  0.125  0.125  0.125  0.125

Upvotes: 1

Pierre D
Pierre D

Reputation: 26321

If the probability of bit transition is only dependent on the original bit value, but independent of the position (i.e. P(xy|ab) == P(yx|ba), then you can simply block-multiply a kernel of transition probabilities:

Let x be a 2x2 matrix such that x[i,j] is the probability of observing bit j given the truth i. I.e.:

x = [[a, b]
     [c, d]]

The 2-bit probability matrix is:

x2 = [[a, a, b, b],          [[a, b, a, b],
      [a, a, b, b],    *      [c, d, c, d],
      [c, c, d, d],           [a, b, a, b],
      [c, c, d, d]]           [c, d, c, d]]

Such a block-multiplication can be expressed simply in numpy:

def bmul(a, x):
    n = a.shape[0] * x.shape[0]
    return (a[:, None, :, None] * x[None, :, None, :]).reshape(n, n)

Example:

u = .2  # "up": p(1|0)
d = .1  # "down": p(0|1)
x = np.array([[1-u, u], [d, 1-d]])

>>> x
array([[0.8, 0.2],
       [0.1, 0.9]])

x2 = bmul(x, x)
>>> x2
array([[0.64, 0.16, 0.16, 0.04],
       [0.08, 0.72, 0.02, 0.18],
       [0.08, 0.02, 0.72, 0.18],
       [0.01, 0.09, 0.09, 0.81]])

x3 = bmul(x2, x)
>>> x3
array([[0.512, 0.128, 0.128, 0.032, 0.128, 0.032, 0.032, 0.008],
       [0.064, 0.576, 0.016, 0.144, 0.016, 0.144, 0.004, 0.036],
       [0.064, 0.016, 0.576, 0.144, 0.016, 0.004, 0.144, 0.036],
       [0.008, 0.072, 0.072, 0.648, 0.002, 0.018, 0.018, 0.162],
       [0.064, 0.016, 0.016, 0.004, 0.576, 0.144, 0.144, 0.036],
       [0.008, 0.072, 0.002, 0.018, 0.072, 0.648, 0.018, 0.162],
       [0.008, 0.002, 0.072, 0.018, 0.072, 0.018, 0.648, 0.162],
       [0.001, 0.009, 0.009, 0.081, 0.009, 0.081, 0.081, 0.729]])

That last value is the matrix you are looking for.

Random check:

# P(100|010) is u*d*(1-u), and we should find it in x3[4,2]
>>> u * d * (1-u)
0.016000000000000004

>>> x3[4,2]
0.016000000000000004

Interesting fact:

bmul is associative but not commutative. In other words:

  • bmul(bmul(a, b), c) == bmul(a, bmul(b, c), but
  • bmul(a, b) != bmul(b, a)

Upvotes: 1

Related Questions