Reputation: 11637
I am trying to write a function that takes a matrix of 2D points and a probability p,
and swaps the coordinates of each point with probability p.
So I asked a question in which I was trying to use a binary sequence as an array of exponents for the matrix swap_matrix=[[0,1],[1,0]],
in order to randomly swap the coordinates of a given proportion of a set of 2D points. However, I realised that the power function only accepts integer exponents, not arrays. And shuffle, as far as I can tell, works on the whole matrix; you cannot specify a particular dimension.
Having either of these two functions would be fine.
For example:
swap(a=[[1,2],[2,3],[3,4],[3,5],[5,6]],b=[0,0,0,1,1])
should return [[1,2],[2,3],[3,4],[5,3],[6,5]]
The idea that just occurred to me, and which I am now editing in, is:
import numpy as np

def swap(mat,K,N):
    # K/N is the proportion of rows to swap; K and N are natural numbers
    # mat is an N*2 matrix; for each row I want to randomly either
    # swap its coordinates or keep it as it is
    a=[[[0,1],[1,0]]]
    b=[[[1,0],[0,1]]]
    a=np.repeat(a,K,axis=0)
    b=np.repeat(b,N-K,axis=0)
    out=np.append(a,b,axis=0)
    np.random.shuffle(out)
    return np.multiply(mat,out.T)
This gives me an error, because I cannot flatten by just one level to make the matrices multipliable!
Again, I am looking for an efficient method (vectorized, in the Matlab sense).
P.S. In my special case the matrix has shape (N,2),
with the second column all ones, if that helps.
Upvotes: 1
Views: 510
Maybe this is good enough for your purposes. In a quick test it appears to be about 13x faster than the blunt for-loop approach (@Naji, posting your "inefficient" code is helpful for making a comparison).
Edited my code following Jaime's comment
import numpy as np

def swap(a, b):
    a = np.copy(a)
    b = np.asarray(b, dtype=bool)
    a[b] = a[b, ::-1]  # equivalent to: a[b] = np.fliplr(a[b])
    return a

# the following is faster, but modifies the original array
def swap_inplace(a, b):
    b = np.asarray(b, dtype=bool)
    a[b] = a[b, ::-1]

print(swap(a=[[1,2],[2,3],[3,4],[3,5],[5,6]],b=[0,0,0,1,1]))
Outputs:
[[1 2]
 [2 3]
 [3 4]
 [5 3]
 [6 5]]
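To tie this back to the question: the mask b still has to be drawn at random. Below is a minimal sketch of two ways to do that, assuming NumPy 1.17+ for the Generator API. The helper names random_mask and exact_mask are made up for illustration; the first flags each row independently with probability p, the second flags exactly K of the N rows.

```python
import numpy as np

def swap(a, b):
    # same logic as the swap function above
    a = np.array(a)                     # np.array copies by default
    b = np.asarray(b, dtype=bool)
    a[b] = a[b, ::-1]
    return a

def random_mask(n, p, rng):
    # hypothetical helper: each of the n rows is flagged independently
    # with probability p, so the number of flagged rows varies per draw
    return rng.random(n) < p

def exact_mask(n, k, rng):
    # hypothetical helper: exactly k of the n rows are flagged
    mask = np.zeros(n, dtype=bool)
    mask[rng.choice(n, size=k, replace=False)] = True
    return mask

rng = np.random.default_rng(0)
pts = np.array([[1, 2], [2, 3], [3, 4], [3, 5], [5, 6]])
mask = exact_mask(len(pts), 2, rng)     # swap exactly 2 of the 5 rows
swapped = swap(pts, mask)
```

Which of the two masks fits depends on whether "probability p" means an independent coin flip per point or a fixed proportion K/N of the points, as in the question's swap(mat,K,N).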
Edit to include more detailed timings
I wanted to know whether I could speed this up further with Cython, so I investigated the efficiency some more :-) I think the results are worth mentioning (since efficiency is part of the actual question), but I do apologize in advance for the amount of additional code.
First the results. The "cython" function is clearly the fastest of all, roughly another 10x faster than the Numpy solution proposed above. The "blunt loop approach" I mentioned is the function named "loop", but as it turns out, much faster loop-based methods are conceivable. My pure Python solution is only about 3x slower than the vectorized Numpy code above! Another thing to note is that "swap_inplace" was, most of the time, only marginally faster than "swap". Also, the timings vary a bit with different random matrices a and b... So now you know :-)
function     | millisec | normalized
-------------+----------+-----------
loop         | 184      | 10.
double_loop  | 84       | 4.7
pure_python  | 51       | 2.8
swap         | 18       | 1
swap_inplace | 17       | 0.95
cython       | 1.9      | 0.11
And the rest of the code I used (it seems I took this way too seriously :P):
def loop(a, b):
    a_c = np.copy(a)
    for i in range(a.shape[0]):
        if b[i]:
            a_c[i,:] = a[i, ::-1]
    return a_c

def double_loop(a, b):
    a_c = np.copy(a)
    n, m = a_c.shape
    for i in range(n):
        if b[i]:
            for j in range(m):
                a_c[i, j] = a[i, m-j-1]
    return a_c

def pure_python(a, b):
    # copy every row; a shallow copy would share the row lists with a
    a_c = [row[:] for row in a]
    n, m = len(a), len(a[0])
    for i in range(n):
        if b[i]:
            for j in range(m):
                a_c[i][j] = a[i][m-j-1]
    return a_c

import pyximport; pyximport.install()
import testcy
def cython(a, b):
    return testcy.swap(a, np.asarray(b, dtype=np.uint8))

def rand_bin_array(K, N):
    # boolean array of length N with exactly K True entries, in random order
    arr = np.zeros(N, dtype=bool)
    arr[:K] = 1
    np.random.shuffle(arr)
    return arr

N = 100000
a = np.random.randint(0, N, (N, 2))
b = rand_bin_array(int(0.33*N), N)  # K must be an integer to be used in a slice

# before timing the pure python solution I first did:
a = a.tolist()
b = b.tolist()
######### In the file testcy.pyx #########
#cython: boundscheck=False
#cython: wraparound=False
import numpy as np
cimport numpy as np

def swap(np.ndarray[np.int_t, ndim=2] a, np.ndarray[np.uint8_t, ndim=1] b):
    cdef np.ndarray[np.int_t, ndim=2] a_c
    cdef int n, m, i, j
    a_c = a.copy()
    n = a_c.shape[0]
    m = a_c.shape[1]
    for i in range(n):
        if b[i]:
            for j in range(m):
                a_c[i, j] = a[i, m-j-1]
    return a_c
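For anyone who wants to repeat the comparison, here is a rough sketch of a timing harness built on the standard timeit module. It is not necessarily how the numbers in the table were measured; the helper time_ms is made up for illustration, and only the two NumPy-based functions are included so the snippet runs without Cython.

```python
import timeit
import numpy as np

def swap(a, b):
    a = np.array(a)                     # np.array copies by default
    b = np.asarray(b, dtype=bool)
    a[b] = a[b, ::-1]
    return a

def loop(a, b):
    a_c = np.copy(a)
    for i in range(a.shape[0]):
        if b[i]:
            a_c[i, :] = a[i, ::-1]
    return a_c

def time_ms(func, a, b, repeat=3):
    # hypothetical helper: best wall-clock time of `repeat` single calls,
    # converted from seconds to milliseconds
    return 1000 * min(timeit.repeat(lambda: func(a, b), number=1, repeat=repeat))

rng = np.random.default_rng(0)
N = 100000
a = rng.integers(0, N, size=(N, 2))
b = rng.random(N) < 0.33                # roughly a third of the rows get swapped

# sanity check: both implementations must agree before timing them
assert np.array_equal(swap(a, b), loop(a, b))

for func in (loop, swap):
    print("%-6s %8.2f ms" % (func.__name__, time_ms(func, a, b)))
```

Taking the minimum over a few repeats is a common way to reduce noise from other processes; absolute numbers will differ from the table depending on machine, Python version and NumPy version.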
Upvotes: 3