DPdl
DPdl

Reputation: 755

Dataframe to a dictionary

I have a very large dataframe, a sample of which looks like this:

df = pd.DataFrame({'From':['a','b','c','a','d'], 'To':['b', 'c', 'a', 'd', 'e'], 'Rates':[1e-4, 2.3e-2, 1e-2, 100, 70]})

In[121]: df
Out[121]: 
  From To     Rates
0    a  b    0.0001
1    b  c    0.0230
2    c  a    0.0100
3    a  d  100.0000
4    d  e   70.0000

The end result I would like is a dictionary that looks like this:

{('a', 'b'): 0.0001,
 ('a', 'd'): 100.0,
 ('b', 'c'): 0.023,
 ('c', 'a'): 0.01,
 ('d', 'e'): 70.0}

The following code works but it is very inefficient for a large df.

from_comps = list(df['From'])
to_comps = list(df['To'])
transfer_rates = {}
for from_comp in from_comps:
    for to_comp in to_comps:
        try:
            transfer_rates[from_comp, to_comp] = df.loc[(df['From'] == from_comp) & (df['To'] == to_comp)]['Rates'].values[0]
        except:
            pass

Is there a more efficient way of doing this?

Upvotes: 0

Views: 41

Answers (3)

gold_cy
gold_cy

Reputation: 14216

We can also use the to_records method to get the desired results.

{(item.From, item.To): item.Rates for item in df.to_records(index=False)}

{('a', 'b'): 0.0001,
 ('b', 'c'): 0.023,
 ('c', 'a'): 0.01,
 ('a', 'd'): 100.0,
 ('d', 'e'): 70.0}

Upvotes: 2

G. Anderson
G. Anderson

Reputation: 5955

Given the input provided, it's far simpler to use the built-in to_dict() method. Note that for a more complex dataset, this might require more tweaking.

df = pd.DataFrame({'From':['a','b','c','a','d'], 'To':['b', 'c', 'a', 'd', 'e'], 'Rates':[1e-4, 2.3e-2, 1e-2, 100, 70]})

df.set_index(['From','To']).to_dict()

{'Rates': {('a', 'b'): 0.0001,
  ('b', 'c'): 0.023,
  ('c', 'a'): 0.01,
  ('a', 'd'): 100.0,
  ('d', 'e'): 70.0}}

df.set_index(['From','To']).to_dict()['Rates']

{('a', 'b'): 0.0001,
 ('b', 'c'): 0.023,
 ('c', 'a'): 0.01,
 ('a', 'd'): 100.0,
 ('d', 'e'): 70.0}

Upvotes: 3

Kenan
Kenan

Reputation: 14094

You could use df.to_dict and pivot_table

df['key'] = list(zip(df['From'], df['To']))
df[['key', 'Rates']].pivot_table(columns='key', values='Rates').to_dict()

{('a', 'b'): {'Rates': 0.0001}, ('a', 'd'): {'Rates': 100.0}, ('b', 'c'): {'Rates': 0.023}, ('c', 'a'): {'Rates': 0.01}, ('d', 'e'): {'Rates': 70.0}}

Upvotes: 0

Related Questions