Reputation: 25

Multiply vector on dataframe - vectorized

I have a pandas dataframe A of size 2441x1441 that is zero in the upper triangle (the diagonal has values). I would like to multiply each column of it by a vector B of length 2441. The tricky part is that I want the first non-zero value of each column of A multiplied by the first value of B (the second non-zero value by the second value of B, and so on). This should happen for all columns of A and result in another dataframe, C.

A=pd.DataFrame(
[[1, 0, 0],
[3, 4, 0],
[6, 7, 8]])

B=np.array([1,2,3,4]).T

Here the result would be

C=[ 1,  0, 0,
    6,  4, 0,
   18, 14, 8]

I have made a for loop, where I can iterate through each value

for x in range(0,len(B)):
    C = (A.iloc[192+x:,:].T*B[0:len(B)-x]).T

However this is very slow, and I need to repeat this operation many times on different datasets. Is there a nice and pythonic way of vectorizing this?

Upvotes: 2

Answers (3)

Alicia Garcia-Raboso

Reputation: 13913

Use np.fromfunction to define a matrix whose entries are the multipliers that you want. For example, if

A = np.array([[1, 0, 0],
              [3, 4, 0],
              [6, 7, 8]])

then

B = np.clip(np.fromfunction(lambda i, j: i-j+1, A.shape), 0, None)

will give you

B = np.array([[1, 0, 0],
              [2, 1, 0],
              [3, 2, 1]])

and then the result you want is simply the elementwise product of A and B:

C = A * B

yields

C = np.array([[1,  0,  0],
              [6,  4,  0],
              [18, 14, 8]])

In fact, since your A is lower-triangular, you can drop the call to np.clip in the definition of B and obtain the same C.

Edit: I slightly misinterpreted the question. If the B in the OP (let me call it b, since I've already used B) is not the sequence of natural numbers, you can do

 B = np.tril(
         np.fromfunction(
             lambda i, j: b[np.clip((i-j).astype(int), 0, b.shape[0] - 1)],
             A.shape))

For example, if

 b = np.array([2, 3, 1, 4])

then you would get

 B = np.array([[2, 0, 0],
               [3, 2, 0],
               [1, 3, 2]])
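
As a self-contained check of the idea above, here is a minimal sketch that builds the multiplier matrix for a general b and applies it. The name offsets is only illustrative, and passing dtype=int to np.fromfunction is my own choice so that no astype call is needed. It assumes b has at least as many entries as A has rows.

```python
import numpy as np

A = np.array([[1, 0, 0],
              [3, 4, 0],
              [6, 7, 8]])
b = np.array([2, 3, 1, 4])

# offsets[i, j] = i - j: 0 on the diagonal, 1 just below it, and so on.
# dtype=int makes np.fromfunction pass integer index grids to the lambda.
offsets = np.fromfunction(lambda i, j: i - j, A.shape, dtype=int)

# Clip so every lookup index stays inside b; np.tril then zeroes the
# entries above the diagonal, where the offset was negative.
B = np.tril(b[np.clip(offsets, 0, len(b) - 1)])

C = A * B
print(B)
print(C)
```

With these inputs B matches the matrix shown above, and C comes out as [[2, 0, 0], [9, 8, 0], [6, 21, 16]].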

Upvotes: 2

yatu

Reputation: 88236

Here's a way to do it:

You can create a lower triangular matrix from B by trimming and zero-padding the vector B for each column, so that its upper triangular part is all zeros.

So in essence you turn the shifted multiplication into an ordinary element-wise product of two matrices. You then simply multiply the two matrices element-wise, using either A*new_b or np.multiply(A, new_b).

new_b = np.array([list(np.pad(B[:-i] if i != 0 else B,(i,0), 'constant')) 
                  for i in range(len(B))]).T[:len(A),:len(A)]

print(new_b)
array([[1, 0, 0],
       [2, 1, 0],
       [3, 2, 1]])

print(new_b*A)
array([[ 1,  0,  0],
       [ 6,  4,  0],
       [18, 14,  8]])
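
If the list comprehension becomes a bottleneck on large inputs, the same new_b can probably be built with index broadcasting instead of padding one column at a time. This is a sketch of a swapped-in technique, not the code above: rows, cols and offset are illustrative names, and it assumes B has at least as many entries as A has rows.

```python
import numpy as np

A = np.array([[1, 0, 0],
              [3, 4, 0],
              [6, 7, 8]])
B = np.array([1, 2, 3, 4])

rows = np.arange(A.shape[0])[:, None]   # column vector of row indices
cols = np.arange(A.shape[1])[None, :]   # row vector of column indices
offset = rows - cols                    # offset[i, j] = i - j

# Above the diagonal (negative offset) the multiplier is 0; elsewhere it is
# B[offset]. np.maximum keeps the lookup index valid for the masked entries.
new_b = np.where(offset >= 0, B[np.maximum(offset, 0)], 0)

print(new_b)
print(new_b * A)
```

This avoids the Python-level loop entirely, at the cost of allocating one integer index array the same shape as A.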

Upvotes: 1

Thomas Kimber

Reputation: 11067

OK, so how about creating an array from your B vector that matches the shape you're after? Once you've transformed it in that way, you could perform an element-wise multiplication, and all the right values will be aligned.

A = np.array([[1, 0, 0],
              [3, 4, 0],
              [6, 7, 8]])
B = np.array([1,2,3,4])

mB = B[:A.shape[0]]
shift = B[:A.shape[0]]
for b in range(0,A.shape[0]):
    shift = np.roll(shift ,1)
    mB = np.append(mB, shift)
mB.resize(A.shape)
np.tril(mB.T)

>>>> array([[1, 0, 0],
            [2, 1, 0],
            [3, 2, 1]])

In the above I force the top-right triangle to be zeros, but since your A matrix already has zeros in those positions, it doesn't really matter what values end up in those positions of the multiplying array - so the np.tril step is not really necessary.

Anyway, whatever your preference, once you've got that form (and there may well be a better way than the one used above to arrive at that form) then you can np.multiply the two objects which will multiply aligned elements.

np.multiply(A, np.tril(mB.T))

>>>> array([[ 1,  0,  0],
            [ 6,  4,  0],
            [18, 14,  8]])
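
Since the question starts from a pandas dataframe, here is one way the whole thing might be wrapped so the result keeps the original index and columns. The helper name shifted_multiply is hypothetical, and the multiplier matrix is built by index lookup rather than with the np.roll loop above. It assumes b has at least as many entries as the dataframe has rows.

```python
import numpy as np
import pandas as pd

def shifted_multiply(df, b):
    """Hypothetical helper: multiply entry (i, j) of df by b[i - j] for i >= j."""
    values = df.to_numpy()
    b = np.asarray(b)
    n_rows, n_cols = values.shape
    # offset[i, j] = i - j, built by broadcasting a column against a row
    offset = np.arange(n_rows)[:, None] - np.arange(n_cols)
    # Clip keeps the lookup in range; np.tril zeroes everything above the diagonal
    multipliers = np.tril(b[np.clip(offset, 0, len(b) - 1)])
    return pd.DataFrame(values * multipliers, index=df.index, columns=df.columns)

A = pd.DataFrame([[1, 0, 0],
                  [3, 4, 0],
                  [6, 7, 8]])
B = np.array([1, 2, 3, 4])

C = shifted_multiply(A, B)
print(C)
```

Because the multiplier matrix depends only on the shape of A and on B, it could be computed once and reused when the same B is applied to many datasets of the same size.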

Upvotes: 1
