Delosari

Reputation: 693

How to find, in a parent string list, the indexes corresponding to a child string list

I am writing code that reads data from a text file. I load the data using numpy.loadtxt, and it could look something like this:

import numpy as np

Shop_Products  = np.array(['Tomatos', 'Bread' , 'Tuna', 'Milk', 'Cheese'])
Shop_Inventory = np.array([12, 6, 10, 7, 8])

I want to check some of the products I have:

Shop_Query     = np.array(['Cheese', 'Bread'])

Now I would like to find the indices of these items in the Shop_Products array without using a for loop and if checks.

I wondered whether this could be done with any of the numpy methods: I thought of using intersect1d to find the common items and then searchsorted. However, I cannot sort my products list, since I do not want to lose the original ordering (for example, I would use the indexes to look up the inventory of each product directly).

Any advice on a "Pythonic" solution?

Upvotes: 4

Views: 1033

Answers (2)

Jaime

Reputation: 67427

np.searchsorted can take a sorting permutation as an optional argument:

>>> sorter = np.argsort(Shop_Products)
>>> sorter[np.searchsorted(Shop_Products, Shop_Query, sorter=sorter)]
array([4, 1])
>>> Shop_Inventory[sorter[np.searchsorted(Shop_Products, Shop_Query, sorter=sorter)]]
array([8, 6])

This is probably faster than np.in1d, which also needs to sort the array. It also returns the indices in the same order as the items appear in Shop_Query, while np.in1d returns them in the order they appear in Shop_Products, regardless of the ordering in the query:

>>> np.in1d(Shop_Products, ['Cheese', 'Bread']).nonzero()
(array([1, 4]),)
>>> np.in1d(Shop_Products, ['Bread', 'Cheese']).nonzero()
(array([1, 4]),)
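One caveat with the searchsorted approach is that it returns an insertion position even for items that are not in Shop_Products, so a missing item silently maps to some neighbouring product. The following is a minimal sketch of one way to guard against that, assuming a query that contains an unstocked item ('Eggs' is made up for illustration, and the lowercase variable names are mine, not the asker's):

```python
import numpy as np

shop_products = np.array(['Tomatos', 'Bread', 'Tuna', 'Milk', 'Cheese'])
shop_inventory = np.array([12, 6, 10, 7, 8])
# 'Eggs' is a hypothetical item that is not in shop_products
shop_query = np.array(['Cheese', 'Bread', 'Eggs'])

# Permutation that would sort shop_products, without reordering it
sorter = np.argsort(shop_products)
pos = np.searchsorted(shop_products, shop_query, sorter=sorter)

# searchsorted may return len(shop_products) for values past the end,
# so clip before using the positions as indices
pos = np.clip(pos, 0, len(shop_products) - 1)
idx = sorter[pos]

# Keep only the positions where the product really matches the query
found = shop_products[idx] == shop_query
matched_idx = idx[found]
matched_stock = shop_inventory[matched_idx]
```

Here found marks which query items exist, and matched_idx keeps the order of the query for the items that were found.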

Upvotes: 9

Alex Riley

Reputation: 176750

You can use in1d() and nonzero() to find the indices of the items in Shop_Products:

>>> np.in1d(Shop_Products, Shop_Query).nonzero()
(array([1, 4]),)

(in1d returns a boolean array indicating whether each item of the first array is in the second; nonzero returns the indices of the True values.)

To look up the corresponding values in Shop_Inventory, use this result to index the array:

>>> i = np.in1d(Shop_Products, Shop_Query).nonzero()
>>> Shop_Inventory[i]
array([6, 8])
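In newer NumPy releases np.isin is the recommended replacement for np.in1d, and np.flatnonzero returns a plain index array instead of a tuple. As a rough sketch of the same lookup under that assumption (variable names are lowercased here for illustration):

```python
import numpy as np

shop_products = np.array(['Tomatos', 'Bread', 'Tuna', 'Milk', 'Cheese'])
shop_inventory = np.array([12, 6, 10, 7, 8])
shop_query = np.array(['Cheese', 'Bread'])

# Boolean mask: True where a product appears anywhere in the query
mask = np.isin(shop_products, shop_query)

# Indices of the True entries, in the order of shop_products
indices = np.flatnonzero(mask)

# Inventory for the matched products (also in shop_products order)
stock = shop_inventory[indices]
```

As with in1d, the result follows the order of shop_products, not the order of the query.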

Upvotes: 3
