Angelika

Reputation: 47

python - combination between elements of two matrices

I have a problem that I find hard to describe, so I will give an example. Let's say we have this array (B) in Python:

[[ 1  1]
 [ 7 11]
 [ 1 20]
 [20  1]
 [26 11]
 [31 11]]

The first column represents the users and the second the tags. I want to create a matrix that has 1s where an edge exists and 0s otherwise. We have 5 and 4 different users and tags respectively, that is, a 6*5 matrix. If I write:

zero = np.zeros((6,5)).astype(int) #it needs one more row and column
for line in B:
    if line[2]:
        zero[line[0],line[1]] = 1

the error is:

   zero[line[0],line[1]] = 1

IndexError: index 7 is out of bounds for axis 0 with size 7

So how can I make this combination, given that I want the element "31" to map to the fifth row and the element "11" to the fourth column?

Upvotes: 1

Views: 182

Answers (2)

user7330712

Reputation:

Use pandas and NumPy:

>>> import numpy as np
>>> import pandas as pd
>>> tagsArray = np.unique([1,11,20,1,11,11])
>>> userArray = np.unique([1,7,20,26,31])
>>> aa = [[ 1,1],[ 7, 11],[1, 20],[20, 1],[26, 11],[31, 11]]
>>> df = pd.DataFrame(index=userArray,columns=tagsArray)
>>> for s in aa:
...     df.loc[s[0],s[1]] = 1
...
>>> df
     1    11   20
1     1  NaN    1
7   NaN    1  NaN
20    1  NaN  NaN
26  NaN    1  NaN
31  NaN    1  NaN
>>> df.fillna(0,inplace=True)
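
As a loop-free variant (not part of the answer above, just a sketch of a different technique), `pd.crosstab` can count how often each user/tag pair occurs, and comparing the counts with 0 turns them into a 0/1 table. The names `table` and `adj` and the `user`/`tag` labels are only illustrative.

```python
import numpy as np
import pandas as pd

# Same pairs as in the question: column 0 = user, column 1 = tag
aa = np.array([[1, 1], [7, 11], [1, 20], [20, 1], [26, 11], [31, 11]])

# Count occurrences of every (user, tag) pair; missing pairs become 0
table = pd.crosstab(aa[:, 0], aa[:, 1], rownames=["user"], colnames=["tag"])

# Collapse counts to a 0/1 indicator
adj = (table > 0).astype(int)
print(adj)
```

The row and column labels stay attached to the result, so `adj.loc[31, 11]` looks up the edge by the original user and tag values rather than by position.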

Upvotes: 3

Divakar

Reputation: 221534

Staying close to your initial attempt, listed below is a NumPy based approach. We can use np.unique(..,return_inverse=1) for those two columns to give us unique IDs that could be used as row and column indices respectively for indexing into the output. Thereafter, we would simply initialize the output array and index into it to give us the desired result.

Thus, an implementation would be -

r,c = [np.unique(i,return_inverse=1)[1] for i in B.T]
out = np.zeros((r.max()+1,c.max()+1),dtype=int)
out[r,c] = 1

Alternatively, a more explicit way to get r and c would be like so -

r = np.unique(B[:,0],return_inverse=1)[1]
c = np.unique(B[:,1],return_inverse=1)[1]
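
To see what `return_inverse` actually gives back, here is a small self-contained sketch on the array from the question. It also keeps the first return value of `np.unique` (the sorted unique labels), which the snippets above discard; the names `users` and `tags` are just illustrative.

```python
import numpy as np

B = np.array([[1, 1], [7, 11], [1, 20], [20, 1], [26, 11], [31, 11]])

# users/tags: sorted unique values; r/c: position of each entry of the
# column inside that sorted unique array (so users[r] rebuilds B[:, 0])
users, r = np.unique(B[:, 0], return_inverse=True)
tags, c = np.unique(B[:, 1], return_inverse=True)

out = np.zeros((users.size, tags.size), dtype=int)
out[r, c] = 1

print(users, r)  # unique user IDs and the row index of each pair
print(tags, c)   # unique tag IDs and the column index of each pair
print(out)
```

Keeping `users` and `tags` around lets you map a row or column of `out` back to the original ID, e.g. row `i` belongs to user `users[i]`.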

Sample input, output -

In [27]: B  # Input array
Out[27]: 
array([[ 1,  1],
       [ 7, 11],
       [ 1, 20],
       [20,  1],
       [26, 11],
       [31, 11]])

In [28]: out  # Output
Out[28]: 
array([[1, 0, 1],
       [0, 1, 0],
       [1, 0, 0],
       [0, 1, 0],
       [0, 1, 0]])

Upvotes: 0
