Reputation: 881
I have a test_df organized like so:
[in]
# Use the arrays to create a dataframe
testing_df = pd.DataFrame(test_array, columns=['transaction_id', 'product_id'])
# Split the product_id's for the testing data
testing_df.set_index(['transaction_id'],inplace=True)
print(testing_df.head(n=5))
[out]
                     product_id
transaction_id
001                      (P01,)
002                  (P01, P02)
003             (P01, P02, P09)
004                  (P01, P03)
005             (P01, P03, P05)
I then performed some calculations on it and created a dictionary to store the results:
# Initialize a dictionary to store the matches
matches = {}
# Return the product combos values that are of the appropriate length and the strings match
for transaction_id, i in enumerate(testing_df['product_id']):
    recommendation = None
    recommended_count = 0
    for k, count in product_combos.items():
        k = list(k)
        if len(i) + 1 == len(k) and count >= recommended_count:
            for product in i:
                if product in k:
                    k.remove(product)
            if len(k) == 1:
                recommendation = k[0]
                recommended_count = count
    matches[transaction_id] = recommendation
print(matches)
[out]
{0: 'P09', 1: 'P09', 2: 'P06', 3: 'P09', 4: 'P09', 5: 'P09'}
The problem I have is that the keys of the matches dictionary should be 001, 002, 003, 004, 005, etc., corresponding to the index of test_df, which runs from 001 to 100.
The second issue is that I would like to fill another dictionary (recommendations) whose keys are 001-100. I would like to fill in the values from matches by matching on the keys.
Upvotes: 2
Views: 2204
Reputation: 21274
There are a couple of issues here. First, the order of the variables you're unpacking from enumerate is switched. The integer counter comes first:
for i, entry in enumerate(values):
    ...
That's why the keys in the matches dict appear as integers.
Second, you still need to access the i-th element of testing_df.index to get the transaction_id you're looking for. You can do this with the i from your (corrected) enumerate():
import pandas as pd

# sample data
transaction_id = ["001", "002", "003", "004", "005"]
product_id = {"product_id": [("P01",), ("P01", "P02"), ("P01", "P02", "P09"),
                             ("P01", "P03"), ("P01", "P03", "P05")]}
testing_df = pd.DataFrame(product_id, index=transaction_id)
testing_df.index.name = "transaction_id"
print(testing_df)
                     product_id
transaction_id
001                      (P01,)
002                  (P01, P02)
003             (P01, P02, P09)
004                  (P01, P03)
005             (P01, P03, P05)
matches = {}
for i, entry in enumerate(testing_df.product_id):
    # ... some computation ...
    transaction_id = testing_df.index[i]
    recommendation = entry[0]  # just as an example
    matches[transaction_id] = recommendation
print(matches)
{'001': 'P01', '002': 'P01', '003': 'P01', '004': 'P01', '005': 'P01'}
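As an alternative, you may be able to skip the positional lookup entirely: iterating over Series.items() yields (index label, value) pairs directly. The sketch below also covers the second part of the question, filling a recommendations dict keyed 001-100 from matches. The sample frame, the entry[0] placeholder, and the all_ids list are assumptions for illustration; substitute your own computation and key range.

```python
import pandas as pd

# Sample data mirroring the frame above (an assumption, not the real data)
testing_df = pd.DataFrame(
    {"product_id": [("P01",), ("P01", "P02"), ("P01", "P02", "P09"),
                    ("P01", "P03"), ("P01", "P03", "P05")]},
    index=pd.Index(["001", "002", "003", "004", "005"], name="transaction_id"),
)

# Series.items() yields (index label, value) pairs, so no testing_df.index[i] is needed
matches = {}
for transaction_id, entry in testing_df["product_id"].items():
    recommendation = entry[0]  # placeholder for the real computation
    matches[transaction_id] = recommendation

# Hypothetical full key range "001".."100", zero-padded to match the index labels
all_ids = [f"{n:03d}" for n in range(1, 101)]

# Copy values across by key; ids that have no entry in matches map to None
recommendations = {tid: matches.get(tid) for tid in all_ids}
```

Using dict.get here means ids without a match default to None rather than raising a KeyError. This only works if the keys in matches are the same zero-padded strings as the index labels.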
Upvotes: 2