Reputation: 1645
I have some words stored in a dictionary, and I want to build a specific array from them based on some sentences.
words = {'a': array([ 1.78505888, -0.40040435, -0.2555062 ]), 'c': array([ 0.58101204, -0.23254054, -0.5700197 ]), 'b': array([ 1.17213122, 0.38232652, -0.78477569]), 'd': array([-0.07545012, -0.10094538, -0.98136142])}
sentences = [['a', 'c'], ['b', 'a', 'd'], ['d', 'c']]
What I want is an array whose first row is the values of 'a' and 'c' stacked vertically.
The second row should be the values of 'b', 'a' and 'd' stacked vertically.
And the third, the values of 'd' and 'c' stacked vertically.
I tried this:
np.vstack([words[word] for word in sentences[0]])
>>> array([[ 1.78505888, -0.40040435, -0.2555062 ],
[ 0.58101204, -0.23254054, -0.5700197 ]])
So this gives me my first row, but I can't work out how to do it for every entry of 'sentences' with a list comprehension (I can only do one at a time).
EDIT : Basically what I'm trying to do is the following
first_row = np.vstack([words[word] for word in sentences[0]])
second_row = np.vstack([words[word] for word in sentences[1]])
third_row = np.vstack([words[word] for word in sentences[2]])
l = []
l.append(first_row)
l.append(second_row)
l.append(third_row)
print np.array(l)
>>> [[[ 1.78505888 -0.40040435 -0.2555062 ]
[ 0.58101204 -0.23254054 -0.5700197 ]]
[[ 1.17213122 0.38232652 -0.78477569]
[ 1.78505888 -0.40040435 -0.2555062 ]
[-0.07545012 -0.10094538 -0.98136142]]
[[-0.07545012 -0.10094538 -0.98136142]
[ 0.58101204 -0.23254054 -0.5700197 ]]]
Upvotes: 2
Views: 1356
Reputation: 221584
You can use np.searchsorted to establish correspondence between the string keys of words and the strings in each element of sentences. Repeat this process for all elements in sentences for the final result. Thus, we would have just one level of looping to solve it. The implementation would look like this -
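import numpy as np
K = list(words.keys())                        # list() is needed on Python 3, where keys() is a view
sortidx = np.argsort(K)                       # indices that put the keys in sorted order
V = np.vstack(list(words.values()))[sortidx]  # value rows, reordered to match the sorted keys
# For each sentence, find each word's position among the sorted keys and pick those rows
out = [V[np.searchsorted(K,S,sorter=sortidx)] for S in sentences]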
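To see why the lookup works, here is a minimal sketch of np.searchsorted with the sorter argument on its own. The names keys, queries, pos and recovered are made up for illustration and are not part of the code above.

```python
import numpy as np

# Illustrative toy data: string keys in arbitrary (unsorted) order.
keys = np.array(['a', 'c', 'b', 'd'])
sortidx = np.argsort(keys)  # indices that would sort keys

# With sorter=sortidx, searchsorted treats keys as if it were sorted and
# returns each query's position in that sorted order.
queries = ['b', 'a', 'd']
pos = np.searchsorted(keys, queries, sorter=sortidx)

# pos indexes into the sorted keys, so this gives back the queried strings.
recovered = keys[sortidx][pos]
print(pos, recovered)
```

Note that np.searchsorted returns an insertion position and does not check that a query is actually present, so a word missing from the dictionary would silently map to a neighbouring row (or to an out-of-range index) rather than raise a KeyError.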
Sample run -
In [122]: words
Out[122]:
{'a': array([ 1.78505888, -0.40040435, -0.2555062 ]),
'b': array([ 1.17213122, 0.38232652, -0.78477569]),
'c': array([ 0.58101204, -0.23254054, -0.5700197 ]),
'd': array([-0.07545012, -0.10094538, -0.98136142])}
In [123]: sentences
Out[123]: [['a', 'c'], ['b', 'a', 'd'], ['d', 'c']]
In [124]: K = list(words.keys())
...: sortidx = np.argsort(K)
...: V = np.vstack(list(words.values()))[sortidx]
...: out = [V[np.searchsorted(K,S,sorter=sortidx)] for S in sentences]
...:
In [125]: out
Out[125]:
[array([[ 1.78505888, -0.40040435, -0.2555062 ],
[ 0.58101204, -0.23254054, -0.5700197 ]]),
array([[ 1.17213122, 0.38232652, -0.78477569],
[ 1.78505888, -0.40040435, -0.2555062 ],
[-0.07545012, -0.10094538, -0.98136142]]),
array([[-0.07545012, -0.10094538, -0.98136142],
[ 0.58101204, -0.23254054, -0.5700197 ]])]
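For comparison, the same list of blocks can probably also be built with a plain nested list comprehension, using an ordinary dictionary lookup per word instead of np.searchsorted. This is only a sketch of that alternative, not the approach used above; the names rows and result are made up for illustration. Because the sentences have different lengths, the blocks cannot form a regular 3-D array, so the sketch packs them into a 1-D object array.

```python
import numpy as np

words = {'a': np.array([ 1.78505888, -0.40040435, -0.2555062 ]),
         'c': np.array([ 0.58101204, -0.23254054, -0.5700197 ]),
         'b': np.array([ 1.17213122,  0.38232652, -0.78477569]),
         'd': np.array([-0.07545012, -0.10094538, -0.98136142])}
sentences = [['a', 'c'], ['b', 'a', 'd'], ['d', 'c']]

# Outer comprehension: one 2-D block per sentence.
# Inner comprehension: the vectors of that sentence's words, stacked as rows.
rows = [np.vstack([words[w] for w in s]) for s in sentences]

# The blocks have different numbers of rows, so store them in an object array
# (assumption: an array container is wanted; the plain list may be enough).
result = np.empty(len(rows), dtype=object)
for i, block in enumerate(rows):
    result[i] = block
```

A missing word raises a KeyError here, which may be preferable when the sentences are not guaranteed to use only known words.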
Upvotes: 2