Using pandas index as dictionary key, fill dictionary with values based on matching keys

Question

I have a DataFrame, testing_df, organized like so:

[in]
import pandas as pd

# Use the arrays to create a dataframe
testing_df = pd.DataFrame(test_array, columns=['transaction_id', 'product_id'])

# Use transaction_id as the index
testing_df.set_index(['transaction_id'], inplace=True)

print(testing_df.head(n=5))

[out]
                     product_id
transaction_id                 
001                      (P01,)
002                  (P01, P02)
003             (P01, P02, P09)
004                  (P01, P03)
005             (P01, P03, P05)

I then performed some calculations on it and created a dictionary to store the results:

# Initialize a dictionary to store the matches
matches = {}

# Return the product combos values that are of the appropriate length and the strings match
for transaction_id,i in enumerate (testing_df['product_id']):
    recommendation = None
    recommended_count = 0

    for k, count in product_combos.items():
        k = list(k)
        if len(i)+1 == len(k) and count >= recommended_count:
            for product in i:
                if product in k: 
                    k.remove(product)
            if len(k) == 1:
                recommendation = k[0]
                recommended_count = count
    matches[transaction_id] = recommendation

print(matches)

[out]
{0: 'P09', 1: 'P09', 2: 'P06', 3: 'P09', 4: 'P09', 5: 'P09'}

The problem is that the keys of the matches dictionary should be 001, 002, 003, 004, 005, etc., corresponding to the index of testing_df, which runs from 001 to 100.

The second issue is that I would like to fill another dictionary (recommendations) whose keys are 001-100. I would like to fill in the values from matches by matching on the keys.

andrew_reece · Accepted Answer

There are a couple of issues here. First, the variables you unpack from enumerate are in the wrong order: the integer counter comes first, followed by the value:

for i, entry in enumerate(values):
    ...

That's why the keys in the matches dict appear as integers.
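
To see the effect in isolation, here is a minimal sketch that uses a made-up list of tuples in place of the product_id column (the names products and swapped_keys are only for illustration, not from the question):

```python
# Stand-in for testing_df['product_id']; the values are illustrative only
products = [("P01",), ("P01", "P02"), ("P01", "P02", "P09")]

# enumerate yields (counter, value), so the first name always receives the counter
swapped_keys = []
for transaction_id, i in enumerate(products):
    swapped_keys.append(transaction_id)  # an int, not an ID from the index

print(swapped_keys)
```

Whatever the first variable is called, it receives the position, so naming it transaction_id does not make it a transaction ID.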

Second, you still need to access the ith element of testing_df.index to get the transaction_id you're looking for. You can do this with the i from your (corrected) enumerate():

import pandas as pd

# sample data
transaction_id = ["001","002","003","004","005"]
product_id = {"product_id":[("P01",), ("P01", "P02"), ("P01", "P02", "P09"),
                            ("P01", "P03"), ("P01", "P03", "P05")]}
testing_df = pd.DataFrame(product_id, index=transaction_id)
testing_df.index.name = "transaction_id"

print(testing_df)
                     product_id
transaction_id                 
001                      (P01,)
002                  (P01, P02)
003             (P01, P02, P09)
004                  (P01, P03)
005             (P01, P03, P05)

matches = {}

for i, entry in enumerate(testing_df.product_id):

    # ... some computation ...

    transaction_id = testing_df.index[i]
    recommendation = entry[0] # just as an example
    matches[transaction_id] = recommendation

print(matches)
{'001': 'P01', '002': 'P01', '003': 'P01', '004': 'P01', '005': 'P01'}
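
An alternative that avoids the positional lookup is to iterate over the column with Series.items(), which yields (index label, value) pairs. The sketch below also takes a stab at the second part of the question, under two assumptions that are not confirmed in the question: that the full set of keys is the zero-padded strings "001" through "100", and that transactions without a match should map to None. The name all_ids is hypothetical, and entry[0] is the same placeholder computation as above.

```python
import pandas as pd

# Same sample data as above
testing_df = pd.DataFrame(
    {"product_id": [("P01",), ("P01", "P02"), ("P01", "P02", "P09"),
                    ("P01", "P03"), ("P01", "P03", "P05")]},
    index=pd.Index(["001", "002", "003", "004", "005"], name="transaction_id"),
)

# Series.items() yields (index label, value), so no enumerate is needed
matches = {}
for transaction_id, entry in testing_df["product_id"].items():
    matches[transaction_id] = entry[0]  # placeholder for the real computation

# Assumption: keys are zero-padded strings "001".."100"; missing ones get None
all_ids = [f"{n:03d}" for n in range(1, 101)]
recommendations = {key: matches.get(key) for key in all_ids}

print(recommendations["003"], recommendations["100"])
```

Using dict.get here means a key with no entry in matches falls back to None instead of raising a KeyError. If the index labels are actually integers rather than strings, all_ids would need to be built to match.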
