Karim.C
Karim.C

Reputation: 21

Advanced filtering of lists

import random
import xlwings as xw

from collections import Counter

wb = xw.Book('Test.xlsx')

sheet = xw.sheets.active

SKUs = sheet.range('A2:C693').value

list_of_prob = sheet.range('D2:D693').value
list_of_prob = [float(i) for i in list_of_prob]

SKUs = random.choices(SKUs, weights = list_of_prob, k=20)

for item in zip(SKUs):
         print (item)

I coded the following program (above) which outputs an order picking list of 20 items based on their probability:

(['Item91', 10.0, 1.0],)
(['Item482', 6.0, 15.0],)
(['Item533', 8.0, 17.0],)
(['Item63', 7.0, 2.0],)
(['Item50', 5.0, 5.0],)
(['Item14', 2.0, 2.0],)
(['Item145', 1.0, 6.0],)
(['Item225', 6.0, 9.0],)
(['Item23', 3.0, 2.0],)
(['Item33', 4.0, 2.0],)
(['Item47', 5.0, 4.0],)
(['Item88', 9.0, 4.0],)
(['Item8', 1.0, 4.0],)
(['Item1', 1.0, 1.0],)
(['Item13', 2.0, 2.0],)
(['Item21', 3.0, 1.0],)
(['Item86', 9.0, 3.0],)
(['Item205', 5.0, 6.0],)
(['Item1', 1.0, 1.0],)
(['Item67', 7.0, 4.0],)

Every item has two numbers that correspond to the aisle and slot in the aisle (in a warehouse). The aim now is to filter the list that all duplicate aisles are removed and only left with the furthest slot for the corresponding aisle.

Example: aisle 1 has four items to be picked. To calculate the order picker’s travel time (return routing policy), I only need the location of the furthest item. That would be slot 6 in aisle 1. Thus I want to filter all the aisle 1 duplicates and only keep ([1.0, 6.0],).

Thus for all aisles I want the following list:

From this:

([10.0, 1.0],)
([6.0, 15.0],)
([8.0, 17.0],)
([7.0, 2.0],)
([5.0, 5.0],)
([2.0, 2.0],)
([1.0, 6.0],)
([6.0, 9.0],)
([3.0, 2.0],)
([4.0, 2.0],)
([5.0, 4.0],)
([9.0, 4.0],)
([1.0, 4.0],)
([1.0, 1.0],)
([2.0, 2.0],)
([3.0, 1.0],)
([9.0, 3.0],)
([5.0, 6.0],)
([1.0, 1.0],)
([7.0, 4.0],)

To this:

([10.0, 1.0],)
([6.0, 15.0],)
([8.0, 17.0],)
([2.0, 2.0],)
([1.0, 6.0],)
([3.0, 2.0],)
([4.0, 2.0],)
([9.0, 4.0],)
([5.0, 6.0],)
([7.0, 4.0],)

I did manage to find a solution in excel. First removing all duplicates and then with the remaining duplicates look for the max value of the corresponding value. Is there a good way to achieve this kind of "advanced" filtering in Python?

Upvotes: 0

Views: 790

Answers (2)

Joshua R.
Joshua R.

Reputation: 2302

Here's a pandas-based solution as suggested by others above. It uses sql-like grouping.

import pandas as pd

datin = [['Item91', 10.0, 1.0],
         ['Item482', 6.0, 15.0],
         ['Item533', 8.0, 17.0],
         ['Item63', 7.0, 2.0],
         ['Item50', 5.0, 5.0],
         ['Item14', 2.0, 2.0],
         ['Item145', 1.0, 6.0],
         ['Item225', 6.0, 9.0],
         ['Item23', 3.0, 2.0],
         ['Item33', 4.0, 2.0],
         ['Item47', 5.0, 4.0],
         ['Item88', 9.0, 4.0],
         ['Item8', 1.0, 4.0],
         ['Item1', 1.0, 1.0],
         ['Item13', 2.0, 2.0],
         ['Item21', 3.0, 1.0],
         ['Item86', 9.0, 3.0],
         ['Item205', 5.0, 6.0],
         ['Item1', 1.0, 1.0],
         ['Item67', 7.0, 4.0]]

pd.DataFrame(datin, columns=['Item', 'Aisle', 'Slot']).groupby(by='Aisle', as_index=False)['Slot'].max().values.tolist()

Upvotes: 0

Blckknght
Blckknght

Reputation: 104712

Use a dictionary to keep track of which rows you've seen so far and what the maximum slot number for each row is:

results = {}
for item, row, slot in SKUs:
    if results.get(row, 0) < slot:
        results[row] = slot

Note that I didn't use zip like you are in your example code, since that seems to be pointlessly wrapping your data in 1-tuples that you don't need.

If you need a list of row, max-slot pairs at the end, use list(results.items())

Upvotes: 1

Related Questions