MapPeddler

Reputation: 53

Can't work out how to apply multi-keyed dictionary values back to a dataframe column

I have a dictionary in which each key is a pair (a 2-tuple) that maps to a single value, like so:

Initial Dict

Key : ('106338', '2006-12-27') , Value : []

Dict after populating

Key : ('106338', '2006-12-27') , Value : [8, 7, 9, 8, 7]

The value for each key pair is a list whose length I need. I built the dictionary by first iterating over a dataframe with itertuples() and creating a key pair with an empty list for each unique record. I then iterated over the dataframe again and appended the values I need to the list for each key pair. The key pairs come from row values: the first item is the asset's identification number and the second item is the asset's date. Here is the code that creates the dict:

perm_dict = {}
# first pass: one empty list per (asset ID, date) key pair
for row in df_perm.itertuples():
    perm_dict[str(row[1]), str(row[3])] = []

# second pass: append row[10] when the date in row[9] falls strictly
# between the dates in row[6] and row[5] (assumes pandas Timestamp values)
for row in df_perm.itertuples():
    if row[6].date() < row[9].date() < row[5].date():
        perm_dict[str(row[1]), str(row[3])].append(row[10])

My problem is that I now need to look those values up by key pair while iterating through the original dataframe, so that I can turn the list lengths into a new column. Screenshot of DataFrame:

[image: screenshot of the DataFrame]

I am having trouble working out how to apply these counts back to the original dataframe as a new column, only for the rows whose keys match. I can't iterate back through it to add them, because then I'd be modifying my original DF while iterating over it, and I've read that's a big no-no. Any help you can provide would be greatly appreciated! Please let me know if I should include more information; I can provide more.

Edit1

Here are the outputs after running the dictionary comprehension code provided.

[image: output after running the dictionary comprehension code]

Upvotes: 0

Views: 43

Answers (1)

jpp

Reputation: 164773

This might be what you are looking for.

import pandas as pd

# sample data
d = {('106338', '2006-12-27'): [8, 7, 9, 8, 7]}
df = pd.DataFrame([['106338', '2006-12-27']], columns=['Key1', 'Key2'])

# first make dictionary mapping to length of list
d_len = {k: len(v) for k, v in d.items()}

# perform mapping
df['Len'] = list(map(d_len.get, zip(df['Key1'], df['Key2'])))

# output
# Key1    Key2        Len
# 106338  2006-12-27  5
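
Since the question asks for counts only on rows whose keys match, it may help to see what happens to rows that have no entry in the dictionary. The following is a minimal sketch, not a definitive implementation: the second sample row and the `LenFilled` column are hypothetical additions for illustration. With `dict.get`, a missing key yields `None` (shown as a missing value in the column), and passing a default such as `0` fills those rows instead.

```python
import pandas as pd

# hypothetical sample data: the second row has no matching key in d
d = {('106338', '2006-12-27'): [8, 7, 9, 8, 7]}
df = pd.DataFrame(
    [['106338', '2006-12-27'], ['999999', '2007-01-01']],
    columns=['Key1', 'Key2'],
)

# map each key pair to the length of its list
d_len = {k: len(v) for k, v in d.items()}

# build one (Key1, Key2) tuple per row, in row order
keys = list(zip(df['Key1'], df['Key2']))

# rows without a match get a missing value
df['Len'] = [d_len.get(key) for key in keys]

# alternative (assumption: 0 is an acceptable count for unmatched rows)
df['LenFilled'] = [d_len.get(key, 0) for key in keys]
```

Because the new columns are assigned in one step from a list built beforehand, the dataframe is not modified while it is being iterated over.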

Upvotes: 1
