Dictionary split into multiple lines per key

Question

I am having trouble splitting the strings in a dictionary into multiple DataFrame rows per key. So far I couldn't find a proper solution. Any help is appreciated.

The following code splits the strings, but it produces only one row per key:

d_new = {k: dict(map(str.strip, x.split('||')) for x in v) for k, v in d.items()}

df = pd.DataFrame.from_dict(d_new, orient='index')

My dictionary d looks like this :

{'Key1': ['A||1234', 'A||1235', 'A||1236', 'B||4567', 'C||78910'],
 'Key2': ['A||165', 'A||135', 'B||888', 'B||1111']}
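
For context, here is a minimal sketch of what the one-line approach above yields with this data. Because dict() keeps only the last value seen for a repeated key, the three A entries of Key1 collapse into one, so each key ends up as a single row. The names d_new and df are the ones used above; the comments describe what I would expect from standard dict and pandas behavior.

```python
import pandas as pd

d = {'Key1': ['A||1234', 'A||1235', 'A||1236', 'B||4567', 'C||78910'],
     'Key2': ['A||165', 'A||135', 'B||888', 'B||1111']}

# Each 'X||value' string becomes a (key, value) pair; dict() then keeps
# only the last value for any key that repeats (e.g. 'A' in Key1).
d_new = {k: dict(map(str.strip, x.split('||')) for x in v) for k, v in d.items()}

# One row per outer key, because every inner dict holds one value per column.
df = pd.DataFrame.from_dict(d_new, orient='index')

print(d_new['Key1'])  # only the last 'A' value survives
print(df.shape)       # two rows: Key1 and Key2
```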

I want to split the data so that Key1 has 3 rows (one for each of the three values of A) and Key2 has 2 rows.

Desired Output:

Key|A|B|C
Key1|1234|4567|78910
Key1|1235|4567|78910
Key1|1236|4567|78910
Key2|165|888|
Key2|135|1111|

Edit1: I'm sorry, I don't know how to make a table here. I added the desired output as well as I could.

jpp · Accepted Answer

The problem is that you need to construct a separate dataframe for each key's list of values. Here's a solution using collections.defaultdict:

d = {'Key1': ['A||1234', 'A||1235', 'A||1236', 'B||4567', 'C||78910'],
     'Key2': ['A||165', 'A||135', 'B||888', 'B||1111']}

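from collections import defaultdict

import pandas as pd

def create_dataframe(k, x):
    # Group the values by their prefix, e.g. {'A': ['1234', '1235', '1236'], 'B': ['4567'], ...}
    dd = defaultdict(list)
    for item in x:
        key, value = item.split('||')
        dd[key].append(value)
    # Lists of unequal length are padded with missing values; transpose so the
    # prefixes become columns, tag every row with the outer key, then forward-fill.
    return pd.DataFrame.from_dict(dd, orient='index').T.assign(Key=k).ffill()

# Build one dataframe per key and stack them.
df = pd.concat(create_dataframe(*item) for item in d.items())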

print(df)

      A     B      C   Key
0  1234  4567  78910  Key1
1  1235  4567  78910  Key1
2  1236  4567  78910  Key1
0   165   888    NaN  Key2
1   135  1111    NaN  Key2
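
If the layout from the question is wanted (Key as the first column and a continuous row index), a small follow-up step should be enough. This is a minimal sketch that repeats the approach above so it runs on its own; the name tidy is only illustrative, and the column list assumes the prefixes A, B and C from the sample data.

```python
from collections import defaultdict

import pandas as pd

d = {'Key1': ['A||1234', 'A||1235', 'A||1236', 'B||4567', 'C||78910'],
     'Key2': ['A||165', 'A||135', 'B||888', 'B||1111']}

def create_dataframe(k, x):
    # Same helper as above: group values by prefix, pad, transpose, forward-fill.
    dd = defaultdict(list)
    for item in x:
        key, value = item.split('||')
        dd[key].append(value)
    return pd.DataFrame.from_dict(dd, orient='index').T.assign(Key=k).ffill()

df = pd.concat([create_dataframe(k, v) for k, v in d.items()])

# Drop the per-key index (0, 1, 2, 0, 1) and move Key to the front.
tidy = df.reset_index(drop=True)[['Key', 'A', 'B', 'C']]

print(tidy)
```

The values stay strings here; if numbers are needed, something like pd.to_numeric applied per column could be added afterwards.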
