Reputation: 1265
I have an array A of size 600x6 in which each row is a vector, and I want to calculate the distance from each row to every other row of the array. Calculating the distance (BD distance) is easy, and I can compute all of the distances and store them in a matrix D (600x600). However, in my code I only have the values of a row, not its index, so I cannot use D to look up the distance quickly and have to calculate it again. Is there a way to assign a label or index to each row of A during the code? For example, if I have A1 and A2, I could immediately see that I need to extract D1,2 for the distance. I am not very familiar with Python. Could you please tell me how I can do this without calculating the distance each time?
As you can see in the following code, the centroid changes at each step, so I have to calculate the BD distance again, which is time-consuming. If I could save the index of the centroid, I could extract the distance from my distance matrix very quickly.
import numpy as np

# Note: rate, bd_rate() and bd_PSNR() are defined elsewhere in my code.
def kmeans_BD(psnr_bitrate, K, centroid):
    m = psnr_bitrate.shape[0]  # number of samples
    n = psnr_bitrate.shape[1]  # number of bitrates
    # creating an empty array
    BD = np.zeros((m, K))
    # weight of BD_rate
    wr = 0.5
    # weight of BD_Q
    wq = 0.5
    n_itr = 10
    # finding the distance between each sample and each centroid
    for itr in range(n_itr):
        for k in range(K):
            for i in range(len(psnr_bitrate)):
                BD_R = bd_rate(rate, centroid[k, :], rate, psnr_bitrate[i, :])
                if BD_R == -2:
                    BD_R = np.inf
                BD_Q = bd_PSNR(rate, centroid[k, :], rate, psnr_bitrate[i, :])
                if BD_Q == -2:
                    BD_Q = np.inf
                BD[i, k] = np.abs(wr * BD_R + wq * BD_Q)
Upvotes: 1
Views: 132
This is an updated answer that incorporates the remarks made in the comments about problems with the code provided earlier.
The getIndex() function is the core of the solution requested in the question and should now work with various array types (Python list, numpy ndarray, sympy Array, ...). It uses different methods to find the index of a row given its value. If no method specialized for the data type is available, the index is found with a loop using Python's all() function.
To demonstrate the functionality, the code comes with a getDistance() function and example array data. The assert statements in the code check that it works as expected:
def getDistance(vector_1, vector_2, vector_matrix_A, distance_matrix_D):
    # Look up both row indices, then read the precomputed distance from D
    try:
        distance = distance_matrix_D[
            getIndex(vector_matrix_A, vector_1)][
            getIndex(vector_matrix_A, vector_2)]
        return distance
    except Exception:
        print("getDistance() exception, returning None")
        return None

def getIndex(vectorArray, vector, verbose=True):
    # Plain Python lists: list.index() compares rows by value
    if isinstance(vectorArray, list) and isinstance(vector, list):
        if verbose: print('list.index()')
        return vectorArray.index(vector)
    # numpy arrays: find the rows in which every element matches
    try:
        import numpy
        if isinstance(vectorArray, numpy.ndarray) and isinstance(vector, numpy.ndarray):
            indx, = numpy.where(numpy.all(vectorArray == vector, axis=1))
            if verbose: print('numpy.where()')
            return indx[0]
    except Exception:
        pass  # no numpy, or no matching row: fall through to the generic loop
    # Generic fallback for any other indexable array type
    for indx, item in enumerate(vectorArray):
        try:
            if vector == item:
                if verbose: print('if vector == item')
                return indx
        except Exception:
            # == did not yield a single truth value; compare element by element
            if all(vector[i] == item[i] for i in range(len(vector))):
                if verbose: print('if all()')
                return indx
    return None

A = [[i*item for i in range(1, 4)] for item in range(1, 7)]
assert A == [[1, 2, 3], [2, 4, 6], [3, 6, 9], [4, 8, 12], [5, 10, 15], [6, 12, 18]]
D = []
for row in range(6):
    column = []
    for colval in range(1+6*row, 7+6*row):
        column.append(colval)
    D.append(column)
assert D == [
    [ 1,  2,  3,  4,  5,  6],
    [ 7,  8,  9, 10, 11, 12],
    [13, 14, 15, 16, 17, 18],
    [19, 20, 21, 22, 23, 24],
    [25, 26, 27, 28, 29, 30],
    [31, 32, 33, 34, 35, 36],
]
vector_3 = A[3]
vector_5 = A[5]
assert getDistance(vector_3, vector_5, A, D) == 24

import numpy
np_A = numpy.array(A)
np_vector_3 = numpy.array(vector_3)
np_vector_5 = numpy.array(vector_5)
assert getDistance(np_vector_3, np_vector_5, np_A, D) == 24

import sympy
sp_A = sympy.Array(A)
sp_vector_3 = sympy.Array(vector_3)
sp_vector_5 = sympy.Array(vector_5)
assert getDistance(sp_vector_3, sp_vector_5, sp_A, D) == 24
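
As a possible alternative to searching the array on every lookup, a dictionary that maps each row (converted to a tuple) to its index can be built once and then queried in constant time. The following is only a minimal sketch under stated assumptions: build_row_index() and lookup_distance() are hypothetical helper names that do not appear in the code above, and it assumes that the rows you look up are exact, unmodified copies of rows in A (floating-point values must match exactly, and for duplicate rows the first index is used).

```python
import numpy as np

def build_row_index(rows):
    """Map each row (as a hashable tuple) to the index of its first occurrence."""
    index = {}
    for i, row in enumerate(rows):
        # setdefault keeps the earliest index if the same row appears twice
        index.setdefault(tuple(row), i)
    return index

def lookup_distance(row_a, row_b, row_index, D):
    """Return the precomputed distance for two rows given by value."""
    i = row_index[tuple(row_a)]  # raises KeyError if the row is not in A
    j = row_index[tuple(row_b)]
    return D[i][j]

# Illustrative data with the same shape as the example above
A = np.array([[i * item for i in range(1, 4)] for item in range(1, 7)])
D = np.arange(1, 37).reshape(6, 6)  # placeholder for the real distance matrix

row_index = build_row_index(A)  # built once, reused for every lookup
print(lookup_distance(A[3], A[5], row_index, D))  # 24
```

If the centroids are always rows of A, it may be simpler still to store the centroid indices instead of their values, so that no lookup by value is needed at all.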
Upvotes: 2