Async
Async

Reputation: 29

How to select n columns from a matrix minimizing a given function

I must buy one of each product, but I can visit no more then n shops. Which n shops should I choose to spend the least amount of money? Products are not divisible, every shop have full inventory.

Shop A Shop B Shop C
Product 1 $10.00 $12.00 $9.99
Product 2 $8.50 $9.99 $7.99
Product 3 $15.00 $14.50 $16.99

So I need to minimize df.min(axis=1).sum() , where df represents any combination of n columns.

The brute force algorithm would be:

import pandas as pd
from itertools import combinations

df = pd.DataFrame({'A': [1,3,4], 'B': [3,1,3], 'C': [2,3,1]})
best_so_far = df.iloc[:, 0]
n = 2
for combo in  combinations(df.columns, n):
    selection = df[list(combo)].min(axis=1)
    if selection.sum() < best_so_far.sum():
        best_so_far = selection

The goal is to improve time complexity. Using greedy approach, or dynamic programming doesn't work here, as there is no optimal subproblem. Sorting columns also doesn't help, because a shop (with half of its products prohibitively expensive and the other half almost free) can have a biggest total sum of its products, but still be the best candidate.

Upvotes: 0

Views: 127

Answers (1)

Async
Async

Reputation: 29

This code solves the problem. Sadly I am not interested in the output itself, but in complexity. How many steps should I do when trying to simulate underlying algorithm on paper?

import pulp

# You could formulate this as in integer linear program with binary variables:
problem = pulp.LpProblem("Product Purchase Problem", pulp.LpMinimize)

# Define the decision variables
stores = ["Store 1", "Store 2", "Store 3"]
products = ["Product 1", "Product 2", "Product 3"]

# Xij = 1 if product j is purchased at store i, else = 0;
X = pulp.LpVariable.dicts("X", [(i, j) for i in stores for j in products], cat="Binary")
# Si = 1 if store i is available, else = 0.
S = pulp.LpVariable.dicts("S", stores, cat="Binary")

# Define the objective function
C = {
    ("Store 1", "Product 1"): 1,
    ("Store 1", "Product 2"): 3,
    ("Store 1", "Product 3"): 4,
    ("Store 2", "Product 1"): 3,
    ("Store 2", "Product 2"): 1,
    ("Store 2", "Product 3"): 3,
    ("Store 3", "Product 1"): 2,
    ("Store 3", "Product 2"): 3,
    ("Store 3", "Product 3"): 1,
}

# Minimize Σij CijXij, where Cij is the cost of purchasing product j from store i
problem += pulp.lpSum([C[(i, j)] * X[(i, j)] for i in stores for j in products])

# Subject to constraints
for j in products:
# Σi Xij = 1 for all products j (each product is purchased at some store)
    problem += pulp.lpSum([X[(i, j)] for i in stores]) == 1
    for i in stores:
# Xij <= Si for all i,j (can only purchase at store i if store i is available)
        problem += pulp.lpSum(X[(i, j)]) <= S[i]

# Subject to constraint Σi Si <= n (at most n stores available)
problem += pulp.lpSum([S[i] for i in stores]) <= 2

# Solve the problem
problem.solve()

# Print the results
for v in problem.variables():
    print(v.name, "=", v.varValue)
print("Total Cost =", pulp.value(problem.objective))

The output:

Result - Optimal solution found

Objective value: 4.00000000

Enumerated nodes: 0

Total iterations: 0

Time (CPU seconds): 0.00

Time (Wallclock seconds): 0.00

Option for printingOptions changed from normal to all

Total time (CPU seconds): 0.00 (Wallclock seconds): 0.00

S_Store_1 = 0.0

S_Store_2 = 1.0

S_Store_3 = 1.0

X_('Store_1',_'Product_1') = 0.0

X_('Store_1',_'Product_2') = 0.0

X_('Store_1',_'Product_3') = 0.0

X_('Store_2',_'Product_1') = 0.0

X_('Store_2',_'Product_2') = 1.0

X_('Store_2',_'Product_3') = 0.0

X_('Store_3',_'Product_1') = 1.0

X_('Store_3',_'Product_2') = 0.0

X_('Store_3',_'Product_3') = 1.0

Total Cost = 4.0

Upvotes: 1

Related Questions