Reputation: 319
I want to create an 8x8 matrix which provides the error probabilities in bit communication. The matrix looks as follows:
The columns amount to the observed quantities and the rows to the measured quantities. An element p[i,j]
amounts to the conditional probability p(j|i)
. For example, the element
p[0,1]
gives the probability to observe the string 001
when the actual value is 000
, i.e., it measures p(001|000)
.
Question: How can I create such a matrix in Python such that
p(100|000)<p(110|000)
?p(001|000)< p(000|001)
. That is, having bias that favors with higher probabilities transitions 1 to 0 than transitions 0 to 1.Of course, the sum of probabilities in each row must equal to 1.
All in all, I want to create function in Python that takes as input an integer n
(the size of the matrix, or equivalently where 2^n
is the length of the bit string) and outputs a probability transition matrix with the above specified rules.
The difficulty is how to implement a probability distribution to fill the cells.
It is trivial to create an 8x8 array and fill diagonals:
P = np.zeros((8,8))
for i in range(8):
for j in range(8):
if i==j:
P[i,j]=1
Similarly, its trivial to fill a given row or a given column by a fixed number. However, I cannot figure out (how to even begin) to fill such a matrix following the rules above, or even how exactly to define the distribution the elements must follow.
Upvotes: 2
Views: 720
Reputation: 26956
The probability of certain bit state transitioning to another bit state can be computed in a number of scenarios.
One of the simplest is when there is a given probability p
of a certain bit transitioning to a different state, which is independent of both the bit value, the position of the bit within the bit state and the other bits transitioning.
Of course, the probability of the bit not flipping is given by q == 1 - p
.
(The statistics of n
independent events with two outcomes has been studied extensively.)
For more bits, the probability of multiple bits transitions can be combined via multiplication.
The probability of transition from a
to b
(where a
and b
are two bits configurations of the same length n
) depend on the number of bit transitions t_ab
and non-transitions s_ab == n - t_ab
:
p(a, b) == (p ** t_ab) * (q ** s_ab)
For example, the transition: 0b00011
to 0b00101
is given by:
p(0b00011, 0b00101) == (q ** 3) * (p ** 2)
Note that this is different from e.g. the 0b011
to 0b101
transition probability, since the number of bits to consider plays a role.
Given a function for counting the number of 1s in a number:
def count_set_bits(num):
result = 0
while num:
result += num & 1
num >>= 1
return result
a simple way of computing t
is via the xor
operator:
t = count_set_bits(a ^ b)
Thus, it is possible to compute the transition probability matrix w_bits
"manually" with simple looping.
This is extremely slow to compute, unless one accelerates the explicit looping.
One of the simplest acceleration for this use case is with Numba.
All _nb
-ending functions are accelerated with it.
The fastmath flag nb.njit(fastmath=True)
could be set to potentially reducing execution time by a few percent.
import numpy as np
import numba as nb
@nb.njit
def count_set_bits(num):
result = 0
while num:
result += num & 1
num >>= 1
return result
@nb.njit
def w_bits_sym_cb_nb(n, p=0.2):
if n > 0:
q = 1 - p
m = 2 ** n
result = np.empty((m, m), dtype=np.float_)
for i in range(m):
for j in range(i + 1):
t = count_set_bits_nb(i ^ j)
s = n - t
result[i, j] = result[j, i] = (p ** t) * (q ** s)
return result
else:
return np.empty((0, 0))
(Note that also count_set_bits()
has been accelerated).
Alternatively, one could get away with element-wise multiplying probability matrices constructed by repeating the base probability matrix for the 1 bit case:
0 1
0 q p
1 p q
with power of two repetitions, e.g. for two bytes:
q p q p q q p p
p q p q X q q p p
q p q p p p q q
p q p q p p q q
This can again be computed with "manual" looping:
@nb.njit
def w_bits_sym_lm_nb(n, p=0.2):
if n > 0:
b = 2
m = b ** n
q = 1 - p
base = np.array([[q, p], [p, q]])
result = np.ones((m, m), dtype=base.dtype)
for k in range(n):
bk = (b ** k)
for i in range(m):
for j in range(m):
result[i, j] *= base[i // bk % b, j // bk % b]
return result
else:
return np.empty((0, 0))
However, a much faster approach consist in performing the element-wise matrix multiplications with repeated elements (a polished version of @PierreD's answer) using broadcasted multiplication:
import numpy as np
def bc_mul(a, b):
nm = len(a) * len(b)
return (a[:, None, :, None] * b[None, :, None, :]).reshape(nm, nm)
def w_bits_sym_bm(n, p=0.2):
if n > 0:
base = np.array([[1 - p, p], [p, 1 - p]])
result = base.copy()
for i in range(1, n):
result = bc_mul(base, result)
return result
else:
return np.empty((0, 0))
Note that because bc_mul()
is associative, one can write the line inside the loop either as result = bc_mul(base, result)
or as result = bc_mul(result, base)
, but with very different performances!
This last approach is also quite fast, especially asymptotically for larger n
, essentially because it performs a exponentially less multiplications.
The same can also be rewritten with Numba with similar (but marginally slower performances):
@nb.njit
def bc_mul_nb(a, b):
n = len(a)
m = len(b)
nm = n * m
result = np.empty((nm, nm), dtype=a.dtype)
for i in range(n):
for j in range(m):
for k in range(n):
for l in range(m):
result[i * m + j, k * m + l] = a[i, k] * b[j, l]
return result
@nb.njit
def w_bits_sym_bm_nb(n, p=0.2):
if n > 0:
base = np.array([[1 - p, p], [p, 1 - p]])
result = base.copy()
for i in range(1, n):
result = bc_mul_nb(base, result)
return result
else:
return np.empty((0, 0))
More on execution speed (including benchmarks) below.
A slightly more complex, and more interesting scenario scenario is when the 0 to 1 and the 1 to 0 probabilities are different, but still independent on the position, etc.
Both can be computed from the base
probability matrix:
0 1
0 p00 p01
1 p10 p11
where p00
, p01
, p10
and p11
are the probabilities of one bit transitioning from one state to another.
Of course:
p00 == 1 - p01
p11 == 1 - p10
As before, for more bits, the probability of multiple bits transitions can be combined via multiplication.
This is essentially an asymmetric version of the above.
The probability of transition from a
to b
(where a
and b
are two bits configurations of the same length) depend on the number of transitions t00_ab
, t01_ab
, t10_ab
, t11_ab
multiplied by their respective probabilities (with the notation used for the symmetric case, t01
and t10
correspond to t
, t00
and t11
correspond to s
):
p(a, b) == (
(p00 ** t00_ab) *
(p01 ** t01_ab) *
(p10 ** t10_ab) *
(p11 ** t11_ab))
For example, the transition: 0b00011
to 0b00101
is given by:
p(0b00011, 0b00101) == (p00 ** 2) * (p01 ** 1) * (p10 ** 1) * (p11 ** 1)
Of course, all this can be computed similarly to the above.
The set bit counting method can be either used directly on ~a & b
and a & ~b
alongside a & b
to count bit transitions:
@nb.njit
def w_bits_cb_nb(n, p01=0.2, p10=-1):
if n > 0:
p10 = p10 if p10 >= 0 else p01
p00 = 1 - p01
p11 = 1 - p10
m = 2 ** n
result = np.empty((m, m), dtype=np.float_)
for i in range(m):
for j in range(m):
t11 = count_set_bits_nb(i & j)
t01 = count_set_bits_nb(~i & j)
t10 = count_set_bits_nb(i & ~j)
t00 = n - (t11 + t01 + t10)
result[i, j] = \
(p00 ** t00) * (p11 ** t11) * (p01 ** t01) * (p10 ** t10)
return result
else:
return np.empty((0, 0))
or could be done slightly more efficiently in a single loop (similar but faster than what is present in @Viglione's current answer):
@nb.njit
def bit_diff_nb(a, b, n):
t11 = t01 = t10 = 0
t00 = n
while a | b:
aa = a & 1
bb = b & 1
t11 += aa & bb
t01 += ~aa & bb
t10 += aa & ~bb
a >>= 1
b >>= 1
t00 = n - (t11 + t01 + t10)
return t00, t11, t01, t10
@nb.njit
def w_bits_bd_nb(n, p01=0.2, p10=-1):
if n > 0:
p10 = p10 if p10 >= 0 else p01
p00 = 1 - p01
p11 = 1 - p10
m = 2 ** n
result = np.empty((m, m), dtype=np.float_)
for i in range(m):
for j in range(m):
t00, t11, t01, t10 = bit_diff_nb(i, j, n)
result[i, j] = \
(p00 ** t00) * (p11 ** t11) * (p01 ** t01) * (p10 ** t10)
return result
else:
return np.empty((0, 0))
Alternatively, all the other methods can be easily extended to this case:
@nb.njit
def w_bits_lm_nb(n, p01=0.2, p10=-1):
if n > 0:
p10 = p10 if p10 >= 0 else p01
b = 2
m = b ** n
base = np.array([[1 - p01, p01], [p10, 1 - p10]])
result = np.ones((m, m), dtype=base.dtype)
for k in range(n):
bk = (b ** k)
for i in range(m):
for j in range(m):
result[i, j] *= base[i // bk % b, j // bk % b]
return result
else:
return np.empty((0, 0))
def w_bits_bm(n, p01=0.1, p10=-1):
if n > 0:
p10 = p10 if p10 >= 0.0 else p01
base = np.array([[1 - p01, p01], [p10, 1 - p10]])
result = base.copy()
for i in range(1, n):
result = bc_mul(base, result)
return result
else:
return np.empty((0, 0))
def w_bits_bmi(n, p01=0.1, p10=-1):
if n > 0:
p10 = p10 if p10 >= 0.0 else p01
base = np.array([[1 - p01, p01], [p10, 1 - p10]])
result = base.copy()
for i in range(1, n):
result = bc_mul(result, base)
return result
else:
return np.empty((0, 0))
For completeness, I also include the currently accepted and top voted answer approach (which is similar to w_bits_bd_nb()
but using binary strings and without acceleration) with some bridging code to get the underlying NumPy array:
import pandas as pd
def calc_p(sent, observed, p01, p10):
p00 = 1 - p01
p11 = 1 - p10
r = 1
for i, _ in enumerate(sent):
if sent[i] == "0":
r *= p00 if observed[i] == "0" else p01
else:
r *= p10 if observed[i] == "0" else p11
return r
def generate_error_matrix(n, p01, p10):
labels = [f"{i:0{n}b}" for i in range(0, 2 ** n)]
result = pd.DataFrame(index=labels, columns=labels)
for rowIndex, row in result.iterrows():
for columnIndex, _ in row.items():
result.at[rowIndex, columnIndex] = calc_p(rowIndex, columnIndex, p01, p10)
return result
def w_bits_bs_pd(n, p01=0.2, p10=-1):
p10 = p10 if p10 >= 0.0 else p01
return generate_error_matrix(n, p01, p10).to_numpy().astype(float)
funcs = (
w_bits_bm, w_bits_bmi,
w_bits_cb_nb, w_bits_bd_nb, w_bits_lm_nb,
w_bits_bm_nb, w_bits_bmi_nb,
w_bits_sym_cb_nb, w_bits_sym_bm_nb, w_bits_sym_lm_nb,
w_bits_bs_pd)
n = 2
base = funcs[0](n)
print(f"{'ProbRowsSumTo1:':>27} {np.allclose(np.sum(base, 0), np.ones(2 ** n))}")
x = w_bits_bm(10, 0.2, 0.2)
print(f"{'(p01 == p10) -> Symmetric:':>27} {np.allclose(x, x.T)}")
x = w_bits_bm(10, 0.2, 0.4)
print(f"{'(p01 != p10) -> Asymmetric:':>27} {not np.allclose(x, x.T)}")
print()
for func in funcs:
res = func(n)
print(f"{func.__name__!s:>20} Same: {np.allclose(base, res)}")
print(func(2))
print()
ProbRowsSumTo1: True
(p01 == p10) -> Symmetric: True
(p01 != p10) -> Asymmetric: True
w_bits_bm Same: True
[[0.64 0.16 0.16 0.04]
[0.16 0.64 0.04 0.16]
[0.16 0.04 0.64 0.16]
[0.04 0.16 0.16 0.64]]
w_bits_bmi Same: True
[[0.64 0.16 0.16 0.04]
[0.16 0.64 0.04 0.16]
[0.16 0.04 0.64 0.16]
[0.04 0.16 0.16 0.64]]
...
The code below shows that:
p01 == p10
the transition matrix is symmetricp01 != p10
the transition matrix is asymmetricSince most of the symmetric implementations are very similar to the asymmetric ones, they have been omitted from the benchmarks.
funcs = (
w_bits_bm, w_bits_bmi,
w_bits_cb_nb, w_bits_bd_nb, w_bits_lm_nb,
w_bits_bm_nb, w_bits_bmi_nb,
w_bits_sym_cb_nb, w_bits_bs_pd)
timings = {}
for n in range(1, 12):
print(f"n = {n}")
timings[n] = []
base = funcs[0](n)
for func in funcs:
res = func(n)
timed = %timeit -r 4 -n 8 -q -o func(n)
timing = timed.best * 1e6
timings[n].append(timing)
print(f"{func.__name__:>24} {np.allclose(base, res)} {timing:10.3f} µs")
to be plotted with:
import pandas as pd
df = pd.DataFrame(data=timings, index=[func.__name__ for func in funcs]).transpose()
df.plot(marker='o', logy=True, xlabel='Num. bits n / #', ylabel='Best timing / µs')
producing:
Which do indicate that broadcasted multiplication based solutions to be the asymptotically for larger n
the most performant, but overall quite performant at all scales.
Note that since the computational complexity grown exponentially, the timings have been plotted with y-logarithmic scale.
Note also that w_bits_bs_pd()
is some orders of magnitudes slower than the others.
As usual, when dealing with well-known objects like tables / matrices, it is beneficial to work with tools specific to that.
If one wants to have nice looking output, one could use Pandas (similarly to what is done in @Viglione's answer) and Seaborn for nicer visualization:
import pandas as pd
import seaborn as sns
def gen_bit_transitions(n, p01=0.2, p10=-1, func=w_bits_bm):
data = func(n, p01, p10)
labels = [f"{i:0{n}b}" for i in range(2**n)]
return pd.DataFrame(data, index=labels, columns=labels)
df = gen_bit_transitions(3, 0.4, 0.2)
sns.set(rc={'figure.figsize': (8, 7)})
sns.heatmap(df, annot=True, vmin=0.0, vmax=1.0)
df = gen_bit_transitions(5, 0.4, 0.2)
sns.set(rc={'figure.figsize': (9, 8)})
sns.heatmap(df, annot=False, vmin=0.0, vmax=1.0)
Upvotes: 1
Reputation: 12102
It turns out you can do this simply without numpy
or scipy
. I use pandas
for nice printing.
The logic is that for each bit, you have a probability of flipping (p01
or p10
) or remaining the same (p00
or p11
). Transforming one bit string to another requires multiplying the appropriate probability for each of n
bits.
For example: P(010|001) = P(0->0) * P(1->0) * P(0->1) = p00 * p10 * p01
This process is repeated for every sent
and observed
combination.
You can further reduce the two level if
statement below to one line using nested ternary assignment, but I think this is a nice balance of being concise and readable:
import pandas as pd
def p(sent, observed, p01, p10):
"""Return the probability of 'sent' being received as 'observed'
given p01 (the probability a bit flips from a 0->1) and p10 (the
probability a bit flips from 1->0).
"""
p00 = 1 - p01
p11 = 1 - p10
r = 1
for i, _ in enumerate(sent):
if sent[i] == "0":
r *= p00 if observed[i] == "0" else p01
else:
r *= p10 if observed[i] == "0" else p11
return r
def generate_error_matrix(n, p01, p10):
"""Print a matrix of the transitions of all permutations of bit
errors for a given bit length.
Parameters:
n - the number of bits
p01 - probability of a bit flipping from 0 to 1
p10 - probability of a bit flipping from 1 to 0
"""
labels = [f"{i:0{n}b}" for i in range(0, 2**n)]
result = pd.DataFrame(index=labels, columns=labels)
for rowIndex, row in result.iterrows():
for columnIndex, _ in row.items():
result.at[rowIndex, columnIndex] = p(rowIndex, columnIndex, p01, p10)
return result
print(generate_error_matrix(n=3, p01=0.2, p10=0.1))
000 001 010 011 100 101 110 111
000 0.512 0.128 0.128 0.032 0.128 0.032 0.032 0.008
001 0.064 0.576 0.016 0.144 0.016 0.144 0.004 0.036
010 0.064 0.016 0.576 0.144 0.016 0.004 0.144 0.036
011 0.008 0.072 0.072 0.648 0.002 0.018 0.018 0.162
100 0.064 0.016 0.016 0.004 0.576 0.144 0.144 0.036
101 0.008 0.072 0.002 0.018 0.072 0.648 0.018 0.162
110 0.008 0.002 0.072 0.018 0.072 0.018 0.648 0.162
111 0.001 0.009 0.009 0.081 0.009 0.081 0.081 0.729
print(generate_error_matrix(n=3, p01=1, p10=0))
000 001 010 011 100 101 110 111
000 0 0 0 0 0 0 0 1
001 0 0 0 0 0 0 0 1
010 0 0 0 0 0 0 0 1
011 0 0 0 0 0 0 0 1
100 0 0 0 0 0 0 0 1
101 0 0 0 0 0 0 0 1
110 0 0 0 0 0 0 0 1
111 0 0 0 0 0 0 0 1
print(generate_error_matrix(n=3, p01=0, p10=1))
000 001 010 011 100 101 110 111
000 1 0 0 0 0 0 0 0
001 1 0 0 0 0 0 0 0
010 1 0 0 0 0 0 0 0
011 1 0 0 0 0 0 0 0
100 1 0 0 0 0 0 0 0
101 1 0 0 0 0 0 0 0
110 1 0 0 0 0 0 0 0
111 1 0 0 0 0 0 0 0
print(generate_error_matrix(n=3, p01=1, p10=1))
000 001 010 011 100 101 110 111
000 0 0 0 0 0 0 0 1
001 0 0 0 0 0 0 1 0
010 0 0 0 0 0 1 0 0
011 0 0 0 0 1 0 0 0
100 0 0 0 1 0 0 0 0
101 0 0 1 0 0 0 0 0
110 0 1 0 0 0 0 0 0
111 1 0 0 0 0 0 0 0
print(generate_error_matrix(n=3, p01=0.5, p10=0.5))
000 001 010 011 100 101 110 111
000 0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125
001 0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125
010 0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125
011 0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125
100 0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125
101 0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125
110 0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125
111 0.125 0.125 0.125 0.125 0.125 0.125 0.125 0.125
Upvotes: 1
Reputation: 26321
If the probability of bit transition is only dependent on the original bit value, but independent of the position (i.e. P(xy|ab) == P(yx|ba)
, then you can simply block-multiply a kernel of transition probabilities:
Let x
be a 2x2 matrix such that x[i,j]
is the probability of observing bit j
given the truth i
. I.e.:
x = [[a, b]
[c, d]]
The 2-bit probability matrix is:
x2 = [[a, a, b, b], [[a, b, a, b],
[a, a, b, b], * [c, d, c, d],
[c, c, d, d], [a, b, a, b],
[c, c, d, d]] [c, d, c, d]]
Such a block-multiplication can be expressed simply in numpy
:
def bmul(a, x):
n = a.shape[0] * x.shape[0]
return (a[:, None, :, None] * x[None, :, None, :]).reshape(n, n)
Example:
u = .2 # "up": p(1|0)
d = .1 # "down": p(0|1)
x = np.array([[1-u, u], [d, 1-d]])
>>> x
array([[0.8, 0.2],
[0.1, 0.9]])
x2 = bmul(x, x)
>>> x2
array([[0.64, 0.16, 0.16, 0.04],
[0.08, 0.72, 0.02, 0.18],
[0.08, 0.02, 0.72, 0.18],
[0.01, 0.09, 0.09, 0.81]])
x3 = bmul(x2, x)
>>> x3
array([[0.512, 0.128, 0.128, 0.032, 0.128, 0.032, 0.032, 0.008],
[0.064, 0.576, 0.016, 0.144, 0.016, 0.144, 0.004, 0.036],
[0.064, 0.016, 0.576, 0.144, 0.016, 0.004, 0.144, 0.036],
[0.008, 0.072, 0.072, 0.648, 0.002, 0.018, 0.018, 0.162],
[0.064, 0.016, 0.016, 0.004, 0.576, 0.144, 0.144, 0.036],
[0.008, 0.072, 0.002, 0.018, 0.072, 0.648, 0.018, 0.162],
[0.008, 0.002, 0.072, 0.018, 0.072, 0.018, 0.648, 0.162],
[0.001, 0.009, 0.009, 0.081, 0.009, 0.081, 0.081, 0.729]])
That last value is the matrix you are looking for.
Random check:
# P(100|010) is u*d*(1-u), and we should find it in x3[4,2]
>>> u * d * (1-u)
0.016000000000000004
>>> x3[4,2]
0.016000000000000004
Interesting fact:
bmul
is associative but not commutative. In other words:
bmul(bmul(a, b), c) == bmul(a, bmul(b, c)
, butbmul(a, b) != bmul(b, a)
Upvotes: 1