Reputation: 615
Say I have a list which contains 16 elements:
lst=['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P']
This list represents a 4 x 4 array where all elements have been put into a 1D list. In its array form it has this form:
'A', 'B', 'C', 'D'
'E', 'F', 'G', 'H'
'I', 'J', 'K', 'L'
'M', 'N', 'O', 'P'
I want to extract a sub-matrix from this 1D list as another 1D list which always starts at the first element.
For example, extracting a 2 x 2 matrix from lst gives:
'A', 'B', 'E', 'F'
Or extracting a 3 x 3 matrix from lst:
'A', 'B', 'C', 'E', 'F', 'G', 'I', 'J', 'K'
To achieve this I use numpy to resize the list into an array, extract the submatrix and then flatten back down again:
import numpy as np
# The size of the matrix represented by lst
init_mat = 4
# Desired matrix size to extract
mat_size = 2
A = np.resize(lst,(init_mat,init_mat))
B = A[0:mat_size, 0:mat_size].flatten()
C = list(map(str, B))  # list() so the result is also a list on Python 3
This works but I was wondering if there was a more pythonic way to do this, as I do not think this method will scale well with matrix size.
Upvotes: 3
Views: 701
Reputation: 1003
Without numpy, and with large matrices in mind, I would walk the list with an iterator so that no intermediate lists are created during the extraction. islice fetches the required items, consuming them from the shared iterator with each slicing operation. When extracting a 3x3 matrix, the first slice starts at index 0 and stops before index 3, taking the first three items from the iterator. Each following slice starts at index 1 (because 4 - 3 = 1 leftover item from the previous row has to be skipped) and stops before index 4.
from itertools import chain, islice, repeat
lst=['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P']
width = 4
extract = [3, 3]
slice_starts = chain([0], repeat(width - extract[0]))
slice_stops = chain([extract[0]], repeat(width))
rows = map(islice, repeat(iter(lst), extract[1]), slice_starts, slice_stops)
print(list(chain.from_iterable(rows)))
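To see why the later slices start at 1 rather than 0, it may help to trace by hand how the shared iterator is consumed. This is only an illustrative sketch of the same idea for the 3x3 case; the names it, first, second and third are not part of the answer's code.

```python
from itertools import islice

lst = list("ABCDEFGHIJKLMNOP")
it = iter(lst)  # one shared iterator; each islice resumes where the last one stopped

first = list(islice(it, 0, 3))   # takes A, B, C; the iterator now sits at D
second = list(islice(it, 1, 4))  # skips D, then takes E, F, G
third = list(islice(it, 1, 4))   # skips H, then takes I, J, K

print(first + second + third)
```

Because every slice is taken relative to the iterator's current position, the start of 1 is what discards the unwanted last column of the previous row.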
Or you could take the first three items out of every four using compress:
from itertools import chain, compress, repeat
lst=['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P']
width = 4
extract = [3, 3]
selectors = repeat([i < extract[0] for i in range(width)], extract[1])
print(list(compress(lst, chain.from_iterable(selectors))))
Upvotes: 1
Reputation: 221534
One array based approach would be -
size = 2 # or 3 or any number <= 4
np.asarray(lst).reshape(4,4)[:size,:size].ravel()
Sample run -
In [55]: lst=['A','B','C','D','E','F','G','H','I','J','K','L','M','N','O','P']
In [56]: size=2
In [57]: np.asarray(lst).reshape(4,4)[:size,:size].ravel()
Out[57]:
array(['A', 'B', 'E', 'F'],
dtype='|S1')
In [58]: size=3
In [59]: np.asarray(lst).reshape(4,4)[:size,:size].ravel()
Out[59]:
array(['A', 'B', 'C', 'E', 'F', 'G', 'I', 'J', 'K'],
dtype='|S1')
If you want a 2D array, skip the ravel() part.
If you want a list as output, append an additional .tolist() step to the output.
If you want to avoid converting the entire list to an array, maybe because the number of elements is very large and the window to be extracted is relatively small, we can just generate the valid indices for the block with some help from NumPy broadcasting. Then, index into the input list with them to get the final output as a list. Thus, we end up with something like this -
idx = (np.arange(size)[:,None]*4 + np.arange(size)).ravel()
out = [lst[i] for i in idx]
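As a rough illustration of what the broadcasting step produces, the sketch below splits the index expression into its two parts for size = 3. The names width, rows and cols are introduced here only for illustration; width stands in for the hard-coded 4 above.

```python
import numpy as np

lst = list("ABCDEFGHIJKLMNOP")
width = 4  # side length of the full matrix (assumed square, as in the question)
size = 3   # side length of the block to extract

rows = np.arange(size)[:, None] * width  # column vector of row offsets: [[0], [4], [8]]
cols = np.arange(size)                   # column offsets within a row: [0, 1, 2]
idx = (rows + cols).ravel()              # broadcasts to a size x size grid, then flattens

out = [lst[i] for i in idx]
print(idx.tolist())
print(out)
```

Only size * size indices are ever materialized, so the cost depends on the block being extracted rather than on the length of lst.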
Upvotes: 2
Reputation: 249133
Calling flatten() then map() is less efficient than:
B = A[:mat_size, :mat_size].reshape(-1)
C = B.tolist()
This avoids some copies and unnecessary function calls.
For more on reshape() vs flatten(), see: What is the difference between flatten and ravel functions in numpy?
You can also do it without NumPy at all. In a way this is simpler. You'd need to test with your specific input data to see which is faster.
[lst[i*init_mat + j] for i in range(mat_size) for j in range(mat_size)]
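A possible variation on the same pure-Python idea, sketched here under the assumption that the input is a row-major square matrix, is to slice one row prefix at a time instead of indexing element by element. The function name extract_block and its parameters are hypothetical, not something from the question.

```python
def extract_block(flat, width, size):
    """Return the top-left size x size block of a row-major width x width matrix."""
    if not 0 <= size <= width:
        raise ValueError("size must be between 0 and width")
    out = []
    for row in range(size):
        start = row * width                   # index of the first element in this row
        out.extend(flat[start:start + size])  # keep the first `size` items of the row
    return out


lst = list("ABCDEFGHIJKLMNOP")
print(extract_block(lst, 4, 2))
print(extract_block(lst, 4, 3))
```

Whether this is any faster than the nested comprehension would need to be measured on your own data.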
Upvotes: 4