Reputation: 47
I have a problem that I'm not sure how to describe, so I'll give an example. Let's say we have this array (B) in Python:
[[ 1  1]
 [ 7 11]
 [ 1 20]
 [20  1]
 [26 11]
 [31 11]]
The first column represents the users, the second the tags. Now I want to create a matrix that has 1s where edges exist and 0s otherwise. We have 5 and 4 different users and tags respectively, that is a 6*5 matrix. If I write:
zero = np.zeros((6,5)).astype(int) #it needs one more row and column
for line in B:
    if line[2]:
        zero[line[0],line[1]] = 1
the error is:
zero[line[0],line[1]] = 1
IndexError: index 7 is out of bounds for axis 0 with size 7
OK, so how can I map the values in B to positions in the matrix? I want the element "31" to be the fifth row and the element "11" the fourth column.
Upvotes: 1
Views: 182
Reputation:
Use pandas and numpy
>>> import numpy as np
>>> import pandas as pd
>>> tagsArray = np.unique([1,11,20,1,11,11])
>>> userArray = np.unique([1,7,20,26,31])
>>> aa = [[ 1,1],[ 7, 11],[1, 20],[20, 1],[26, 11],[31, 11]]
>>> df = pd.DataFrame(index=userArray,columns=tagsArray)
>>> for s in aa:
...     df.loc[s[0],s[1]] = 1
...
>>> df.fillna(0,inplace=True)
>>> df
    1  11  20
1   1   0   1
7   0   1   0
20  1   0   0
26  0   1   0
31  0   1   0
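As a possible alternative to the explicit loop, pandas also has pd.crosstab, which tabulates how often each pair of values occurs. The sketch below is not part of the answer above; it assumes the same edge list aa, and the name adj is only illustrative. Comparing the counts with 0 turns them into a 0/1 indicator, so repeated edges would still give 1.

```python
import numpy as np
import pandas as pd

aa = np.array([[1, 1], [7, 11], [1, 20], [20, 1], [26, 11], [31, 11]])

# crosstab counts how often each (user, tag) pair occurs;
# "> 0" converts the counts into a boolean indicator, cast back to int.
adj = (pd.crosstab(aa[:, 0], aa[:, 1]) > 0).astype(int)
print(adj)
```

The row index of adj holds the distinct users and the columns hold the distinct tags, both sorted.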
Upvotes: 3
Reputation: 221534
Staying close to your initial attempt, listed below is a NumPy based approach. We can use np.unique(..,return_inverse=1) for those two columns to give us unique IDs that could be used as row and column indices respectively for indexing into the output. Thereafter, we would simply initialize the output array and index into it to give us the desired result.
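To make the role of return_inverse concrete, here is a small sketch on the user column from the question. The variable names users, labels and idx are only for illustration.

```python
import numpy as np

users = np.array([1, 7, 1, 20, 26, 31])  # first column of B

# labels: the sorted distinct values
# idx: for every entry of users, its position inside labels
labels, idx = np.unique(users, return_inverse=True)

print(labels)  # [ 1  7 20 26 31]
print(idx)     # [0 1 0 2 3 4]

# Indexing labels with idx reconstructs the original column.
print(np.array_equal(labels[idx], users))  # True
```

So user 31 gets ID 4, which makes it the fifth row of the output regardless of how large the raw value is.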
Thus, an implementation would be -
r,c = [np.unique(i,return_inverse=1)[1] for i in B.T]
out = np.zeros((r.max()+1,c.max()+1),dtype=int)
out[r,c] = 1
Alternatively, a more explicit way to get r and c would be like so -
r = np.unique(B[:,0],return_inverse=1)[1]
c = np.unique(B[:,1],return_inverse=1)[1]
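If you also need to know which user or tag each row or column stands for, one option is to keep the unique values that np.unique returns alongside the indices. The following is a minimal sketch under that assumption; the function name edges_to_matrix and the returned label arrays are hypothetical additions, not part of the approach above.

```python
import numpy as np

def edges_to_matrix(B):
    """Return (matrix, row_labels, col_labels) for an (n, 2) edge array."""
    B = np.asarray(B)
    row_labels, r = np.unique(B[:, 0], return_inverse=True)
    col_labels, c = np.unique(B[:, 1], return_inverse=True)
    out = np.zeros((row_labels.size, col_labels.size), dtype=int)
    out[r, c] = 1  # a repeated edge just sets the same cell again
    return out, row_labels, col_labels

B = np.array([[1, 1], [7, 11], [1, 20], [20, 1], [26, 11], [31, 11]])
out, users, tags = edges_to_matrix(B)

# The label arrays are sorted, so searchsorted finds the row/column
# of a given user/tag value that is known to be present.
i = np.searchsorted(users, 31)
j = np.searchsorted(tags, 11)
print(out[i, j])  # 1
```

Note that searchsorted only gives a meaningful position for values that actually occur in the labels; for unknown values you would need an explicit membership check.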
Sample input, output -
In [27]: B # Input array
Out[27]:
array([[ 1,  1],
       [ 7, 11],
       [ 1, 20],
       [20,  1],
       [26, 11],
       [31, 11]])
In [28]: out # Output
Out[28]:
array([[1, 0, 1],
       [0, 1, 0],
       [1, 0, 0],
       [0, 1, 0],
       [0, 1, 0]])
Upvotes: 0