
Reputation: 1754

Get a dict from a DataFrame whose rows contain multiple values?

Input

df 

A         B
a         23
b,c       34
d,e,%f    30

Goal

df_dct = {'a':23,'b':34,'c':34,'d':30,'e':30,'f':30}

The details are as follows:

Column A provides the keys and column B the values.
The values in A are strings, and some of them hold several keys grouped by ','.
The keys come from splitting on ',', and every '%' and space should be removed from them.

Try

I know how to use zip to build a dict from two DataFrame columns, but I could not handle the splitting.

Upvotes: 1

Answers (2)

Mayank Porwal

Reputation: 34086

You can use df.explode() (pandas >= 0.25) together with df.to_dict():

In [32]: df.A = df.A.str.replace("%", "")
In [42]: df_dct = df.assign(var1=df['A'].str.split(',')).explode('var1').drop(columns='A').set_index('var1').to_dict()['B']

In [43]: df_dct
Out[43]: {'a': 23, 'b': 34, 'c': 34, 'd': 30, 'e': 30, 'f': 30}
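For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same explode idea. The DataFrame is rebuilt from the question's input, the helper column name `key` is just an illustrative choice, and the regex `[%\s]` is an assumption added here so that spaces are removed along with '%', as the question asks.

```python
import pandas as pd

# Sample data rebuilt from the question's input table
df = pd.DataFrame({"A": ["a", "b,c", "d,e,%f"], "B": [23, 34, 30]})

# Strip '%' and whitespace, then split each cell of A into a list of keys
keys = df["A"].str.replace(r"[%\s]", "", regex=True).str.split(",")

# One row per key, then map each key to its B value
df_dct = (
    df.assign(key=keys)
    .explode("key")
    .set_index("key")["B"]
    .to_dict()
)

print(df_dct)
```

Selecting the B column before calling to_dict() avoids the nested {'B': {...}} result, so no final ['B'] lookup is needed.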

Upvotes: 1

sammywemmy

Reputation: 28709

Remove the percent sign from column A with str.replace:

df["A"] = df.A.str.replace("%", "")

Use itertools.product to pair every key split from A with that row's value in B, then flatten the per-row pairs into a single iterable with chain.from_iterable:

from itertools import product, chain
# pass the flattened (key, value) pairs to dict to get the final result
dict(chain.from_iterable((product(A.split(","),[B])) for A,B in df.to_numpy()))

{'a': 23, 'b': 34, 'c': 34, 'd': 30, 'e': 30, 'f': 30}
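Since the question mentions zip, the same pairing can also be sketched as a nested dict comprehension without itertools. This is only an illustration, not a replacement for the approach above: the DataFrame is rebuilt from the question's input, except that a space was added in "b, c" (an assumption, not in the original data) to show the space handling the question asks for.

```python
import pandas as pd

# Question's input, with a space added in "b, c" for illustration only
df = pd.DataFrame({"A": ["a", "b, c", "d,e,%f"], "B": [23, 34, 30]})

# For each row, split A on ',' and map every cleaned key to that row's B value
df_dct = {
    key.replace("%", "").replace(" ", ""): value
    for keys, value in zip(df["A"], df["B"])
    for key in keys.split(",")
}

print(df_dct)
```

Cleaning each key inside the comprehension leaves df itself unmodified, which may or may not be what you want.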

Upvotes: 0
