Reputation: 101
I am trying to iterate over a NumPy array to create a list of lists, but my for loop appends the rows in alphabetical order (the order of the array) rather than in the order in which the names occur in my list.
Here is a portion of my NumPy array that I can use as an example:
import numpy as np

tarifas = np.array([['Afganistán', '577.21', '0.9360168799091559', '1.01745744495737'],
                    ['Albania', '5450.0', '1.1439867079655244', '0.9195410037811979'],
                    ['Alemania', '49690', '1.0034542200895549', '0.9873874704432137'],
                    ['Angola', '3670.0', '0.931103978746121', '1.162652536895962'],
                    ['Antigua y Barbuda', '18170', '0.7795684991736309', '0.6399312443495023'],
                    ['Arabia Saudita', '23490', '1.0573676413333202', '0.7477763277701148'],
                    ['Argelia', '4650.0', '0.7969840140783656', '0.5123046862189027'],
                    ['Argentina', '9050.0', '1.3647162509775996', '0.48274125735042017'],
                    ['Armenia', '4450.0', '1.4545784506262867', '1.430465487479917'],
                    ['Australia', '57200', '0.7293018985322222', '1.1744384938116095'],
                    ['Austria', '52470', '1.2396562976033307', '0.8630735107719588'],
                    ['Azerbaiyán', '4780.0', '0.9111186496911305', '0.534268284966654']])
I want to build the list of lists by iterating over a second list that holds the names of the countries I need to find in the array, i.e.
list_countries = ["Angola", "Austria", "Argentina", "Albania", "Armenia"]
Note that this list is not in alphabetical order, and the resulting list of lists should follow the order of this list. The output after iteration should be the following:
new_list_of_countries = [['Angola' '3670.0' '0.931103978746121' '1.162652536895962'],
['Austria' '52470' '1.2396562976033307' '0.8630735107719588'],
['Argentina' '9050.0' '1.3647162509775996' '0.48274125735042017'],
['Albania' '5450.0' '1.1439867079655244' '0.9195410037811979'],
['Armenia' '4450.0' '1.4545784506262867' '1.430465487479917']]
Here is the code I used:
tarifas_paises_escogidos = []
for i in tarifas:
    for v in list_countries:
        if str(v) in str(i):
            tarifas_paises_escogidos.append(i)
print(np.array(tarifas_paises_escogidos))
Upvotes: 1
Views: 313
Reputation: 13697
Since the original NumPy array, tarifas, is sorted alphabetically, you can use np.searchsorted to get the indices corresponding to list_countries:
indices = np.searchsorted(tarifas[:, 0], list_countries)
print(indices)
# [ 3 10 7 1 8]
and then use fancy indexing (indexing arrays using arrays) to get the desired result:
result = tarifas[indices]
print(result)
# [['Angola' '3670.0' '0.931103978746121' '1.162652536895962']
# ['Austria' '52470' '1.2396562976033307' '0.8630735107719588']
# ['Argentina' '9050.0' '1.3647162509775996' '0.48274125735042017']
# ['Albania' '5450.0' '1.1439867079655244' '0.9195410037811979']
# ['Armenia' '4450.0' '1.4545784506262867' '1.430465487479917']]
For large arrays, this vectorized approach should be much faster than the Python for-loop solution in Chris's answer.
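One caveat with the searchsorted approach: np.searchsorted returns insertion positions, so a name that is not in the array still gets an index, and fancy indexing would then silently return the wrong row. Below is a minimal sketch of a guard, assuming the first column is sorted in ascending order. The helper name lookup_rows and the shortened demo array are made up for illustration.

```python
import numpy as np

def lookup_rows(table, names):
    """Return rows of `table` whose first column matches `names`, in the order of `names`.

    Assumes table[:, 0] is sorted ascending. Raises KeyError for names not present.
    """
    keys = table[:, 0]
    idx = np.searchsorted(keys, names)
    # searchsorted can return len(keys) for names past the end, so clip before indexing
    safe = np.clip(idx, 0, len(keys) - 1)
    # A name is really present only if the key at its insertion point equals it
    found = keys[safe] == np.asarray(names)
    if not found.all():
        missing = [n for n, ok in zip(names, found) if not ok]
        raise KeyError(f"not found: {missing}")
    return table[safe]

# Shortened two-column sample, sorted by name
demo = np.array([['Albania', '5450.0'],
                 ['Angola', '3670.0'],
                 ['Austria', '52470']])
print(lookup_rows(demo, ['Austria', 'Albania']))
```

If the array is not sorted, this check reports names as missing even when they are present, so sort the array first (or use a different lookup).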
Upvotes: 0
Reputation: 29742
Using a list comprehension with sorted:
sorted([t for t in tarifas if t[0] in list_countries],
       key=lambda x: list_countries.index(x[0]))
Output:
[['Angola', '3670.0', '0.931103978746121', '1.162652536895962'],
['Austria', '52470', '1.2396562976033307', '0.8630735107719588'],
['Argentina', '9050.0', '1.3647162509775996', '0.48274125735042017'],
['Albania', '5450.0', '1.1439867079655244', '0.9195410037811979'],
['Armenia', '4450.0', '1.4545784506262867', '1.430465487479917']]
One without using a list comprehension:
tarifas_paises_escogidos = []
for t in tarifas:
    # for v in list_countries: You don't need this
    if t[0] in list_countries:
        tarifas_paises_escogidos.append(t)
print(tarifas_paises_escogidos)
which yields a filtered but unsorted list:
[['Albania', '5450.0', '1.1439867079655244', '0.9195410037811979'],
['Angola', '3670.0', '0.931103978746121', '1.162652536895962'],
['Argentina', '9050.0', '1.3647162509775996', '0.48274125735042017'],
['Armenia', '4450.0', '1.4545784506262867', '1.430465487479917'],
['Austria', '52470', '1.2396562976033307', '0.8630735107719588']]
Then you sort (and do assign it back!):
tarifas_paises_escogidos = sorted(tarifas_paises_escogidos, key=lambda x: list_countries.index(x[0]))
which produces the sorted output shown above.
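As an alternative to filtering first and sorting afterwards, a dictionary keyed by country name lets you build the result directly in the order of list_countries. This is a different technique from the sorted call above, sketched here on a shortened two-column sample and assuming country names are unique. The name rows_by_name is introduced only for this example.

```python
import numpy as np

# Shortened sample of the array from the question (two columns only)
tarifas = np.array([['Albania', '5450.0'],
                    ['Angola', '3670.0'],
                    ['Austria', '52470']])
list_countries = ['Austria', 'Albania']

# Map each country name to its row (assumes names are unique)
rows_by_name = {str(row[0]): row for row in tarifas}

# Walk the requested names in order, skipping any that are not in the table
selected = [rows_by_name[name] for name in list_countries if name in rows_by_name]
print(np.array(selected))
```

This avoids calling list_countries.index once per row, which may matter when both the array and the list are long.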
Insight:
In the lambda above, x is just a placeholder name: whatever argument the lambda receives is bound to x and then used for indexing (i.e. x[0]).
It is equivalent to:
def some_func(x):
    return list_countries.index(x[0])
then used in sorted:
tarifas_paises_escogidos = sorted(tarifas_paises_escogidos, key=some_func)
But defining a named function for a single use is often needlessly verbose. That's when lambda kicks in :).
Upvotes: 1