dooms

Reputation: 1645

Create array from dict

I have a dictionary that maps words to vectors, and a list of sentences. From these I want to build a specific array.

words = {'a': array([ 1.78505888, -0.40040435, -0.2555062 ]), 'c': array([ 0.58101204, -0.23254054, -0.5700197 ]), 'b': array([ 1.17213122,  0.38232652, -0.78477569]), 'd': array([-0.07545012, -0.10094538, -0.98136142])}

sentences = [['a', 'c'], ['b', 'a', 'd'], ['d', 'c']]

What I want is an array whose first row is the values of 'a' and 'c' stacked vertically,
whose second row is the values of 'b', 'a' and 'd' stacked vertically,
and whose third row is the values of 'd' and 'c' stacked vertically.

I tried this:

np.vstack([words[word] for word in sentences[0]])
>>> array([[ 1.78505888, -0.40040435, -0.2555062 ],
           [ 0.58101204, -0.23254054, -0.5700197 ]])

This gives me my first row, but I can't work out how to do it for every element of 'sentences' with a list comprehension (only for one at a time).

EDIT: Basically, what I'm trying to do is the following:

first_row = np.vstack([words[word] for word in sentences[0]])
second_row = np.vstack([words[word] for word in sentences[1]])
third_row = np.vstack([words[word] for word in sentences[2]])

l = []
l.append(first_row)
l.append(second_row)
l.append(third_row)

print np.array(l)
>>> [[[ 1.78505888 -0.40040435 -0.2555062 ]
      [ 0.58101204 -0.23254054 -0.5700197 ]]

     [[ 1.17213122  0.38232652 -0.78477569]
      [ 1.78505888 -0.40040435 -0.2555062 ]
      [-0.07545012 -0.10094538 -0.98136142]]

     [[-0.07545012 -0.10094538 -0.98136142]
      [ 0.58101204 -0.23254054 -0.5700197 ]]]

Upvotes: 2

Views: 1356

Answers (1)

Divakar

Reputation: 221584

You can use np.searchsorted to map the strings in each element of sentences to the string keys of words, then repeat this for every element of sentences to get the final result. That way only one level of looping is needed. The implementation would look like this -

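K = list(words.keys())    # list() so this also works on Python 3
sortidx = np.argsort(K)   # order that sorts the keys
V = np.vstack(list(words.values()))[sortidx]   # values, in sorted-key order
out = [V[np.searchsorted(K,S,sorter=sortidx)] for S in sentences]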
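
As a rough illustration of the lookup step only, the sketch below shows what np.searchsorted returns for a single sentence. The keys are deliberately listed out of order, and the names keys, idx and looked_up are illustrative, not part of the answer's code.

```python
import numpy as np

# Keys in arbitrary (unsorted) order, as a dict might yield them.
keys = np.array(['a', 'c', 'b', 'd'])
sortidx = np.argsort(keys)   # positions that would sort the keys

# With sorter=..., searchsorted returns positions in the *sorted* key order.
idx = np.searchsorted(keys, ['b', 'a', 'd'], sorter=sortidx)

# Reordering the keys the same way lets idx pick out the matching entries.
looked_up = keys[sortidx][idx]
```

Note that np.searchsorted does not check for exact matches, so this approach assumes every word in a sentence is actually a key of words.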

Sample run -

In [122]: words
Out[122]: 
{'a': array([ 1.78505888, -0.40040435, -0.2555062 ]),
 'b': array([ 1.17213122,  0.38232652, -0.78477569]),
 'c': array([ 0.58101204, -0.23254054, -0.5700197 ]),
 'd': array([-0.07545012, -0.10094538, -0.98136142])}

In [123]: sentences
Out[123]: [['a', 'c'], ['b', 'a', 'd'], ['d', 'c']]

In [124]: K = list(words.keys())
     ...: sortidx = np.argsort(K)
     ...: V = np.vstack(list(words.values()))[sortidx]
     ...: out = [V[np.searchsorted(K,S,sorter=sortidx)] for S in sentences]
     ...: 

In [125]: out
Out[125]: 
[array([[ 1.78505888, -0.40040435, -0.2555062 ],
        [ 0.58101204, -0.23254054, -0.5700197 ]]),
 array([[ 1.17213122,  0.38232652, -0.78477569],
        [ 1.78505888, -0.40040435, -0.2555062 ],
        [-0.07545012, -0.10094538, -0.98136142]]),
 array([[-0.07545012, -0.10094538, -0.98136142],
        [ 0.58101204, -0.23254054, -0.5700197 ]])]
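
For comparison, a plain nested list comprehension should give the same list of arrays. This is a minimal sketch rather than the approach above: it uses a direct dict lookup per word, and the name rows is only illustrative.

```python
import numpy as np

words = {
    'a': np.array([1.78505888, -0.40040435, -0.2555062]),
    'b': np.array([1.17213122, 0.38232652, -0.78477569]),
    'c': np.array([0.58101204, -0.23254054, -0.5700197]),
    'd': np.array([-0.07545012, -0.10094538, -0.98136142]),
}
sentences = [['a', 'c'], ['b', 'a', 'd'], ['d', 'c']]

# One 2-D array per sentence. The sentences differ in length, so the
# result is kept as a list instead of one rectangular array.
rows = [np.vstack([words[w] for w in s]) for s in sentences]
```

Unlike the searchsorted version, a word missing from words raises a KeyError here instead of silently returning a wrong row.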

Upvotes: 2
