Thomas Matthew

Reputation: 2886

Pandas dataframe into sparse dictionary of dictionaries

How do I convert a pandas DataFrame into a sparse dictionary of dictionaries, where only the index entries that pass some cutoff are kept? In the toy example below, I only want the index entries in each column whose values are < 0.

import pandas as pd

table1 = [['gene_a', -1, 1], ['gene_b', 1, 1], ['gene_c', 0, -1]]
df1 = pd.DataFrame(table1)
df1.columns = ['gene','cell_1', 'cell_2']
df1 = df1.set_index('gene')
dfasdict = df1.to_dict(orient='dict')

This gives:

dfasdict = {'cell_1': {'gene_a': -1, 'gene_b': 1, 'gene_c': 0}, 'cell_2': {'gene_a': 1, 'gene_b': 1, 'gene_c': -1}}

But the desired output is a sparse dictionary, where only values less than zero are shown:

desired = {'cell_1': {'gene_a': -1}, 'cell_2': {'gene_c': -1}}

I could post-process the dfasdict dictionary after creating it, but I want to do the conversion in a single step, since post-processing means iterating over very large dictionaries. Is it possible to do this entirely within pandas?

Upvotes: 6

Views: 419

Answers (2)

Alexander

Reputation: 109520

This uses a dictionary comprehension. For each of the columns cell_1 and cell_2, it selects the rows whose values are less than (lt) zero and converts that selection to a dictionary.

>>> {col: df1.loc[df1[col].lt(0), col].to_dict() for col in ['cell_1', 'cell_2']}
{'cell_1': {'gene_a': -1}, 'cell_2': {'gene_c': -1}}

To help understand what is going on here:

>>> df1['cell_1'].lt(0)
gene
gene_a     True
gene_b    False
gene_c    False
Name: cell_1, dtype: bool

>>> df1.loc[df1['cell_1'].lt(0), 'cell_1'].to_dict()
{'gene_a': -1}
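
The same comprehension can be written so that it does not hard-code the column names. This is only a minimal sketch that rebuilds the toy frame from the question; the `cutoff` variable and the `sparse` name are illustrative and not part of the original answer.

```python
import pandas as pd

# Toy data from the question
table1 = [['gene_a', -1, 1], ['gene_b', 1, 1], ['gene_c', 0, -1]]
df1 = pd.DataFrame(table1, columns=['gene', 'cell_1', 'cell_2']).set_index('gene')

cutoff = 0  # illustrative threshold; keep only values strictly below it

# For every column, keep the rows below the cutoff and convert them to a dict
sparse = {col: df1.loc[df1[col].lt(cutoff), col].to_dict() for col in df1.columns}
print(sparse)
```

A column with no values below the cutoff still appears in the result, mapped to an empty dictionary.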

Upvotes: 2

su79eu7k

Reputation: 7306

Delete the last line of your code and add the following.

# DataFrame.items() yields (column name, Series) pairs; it replaces the Python 2-era pandas.compat.iteritems

def to_dict_custom(data):
    return {k: v[v < 0].to_dict() for k, v in data.items()}

dfasdict = to_dict_custom(df1)
print(dfasdict)

which yields,

{'cell_1': {'gene_a': -1}, 'cell_2': {'gene_c': -1}}

The function definition was inspired by here; please check.
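
If the filtering should happen on the whole frame at once rather than column by column, one possible alternative (not from either answer, just a sketch) is to reshape to long format with `melt`, filter once, and then group by column. The names `long_df`, `neg`, and `sparse`, and the column labels `cell` and `value`, are made up for this example.

```python
import pandas as pd

# Toy data from the question
table1 = [['gene_a', -1, 1], ['gene_b', 1, 1], ['gene_c', 0, -1]]
df1 = pd.DataFrame(table1, columns=['gene', 'cell_1', 'cell_2']).set_index('gene')

# Long format: one row per (gene, cell) pair
long_df = df1.reset_index().melt(id_vars='gene', var_name='cell', value_name='value')

# Single vectorized filter over all values
neg = long_df[long_df['value'] < 0]

# Build one inner dict per cell from the surviving rows
sparse = {cell: dict(zip(grp['gene'], grp['value'])) for cell, grp in neg.groupby('cell')}
print(sparse)
```

Unlike the per-column versions, a column with no values below zero is simply missing from the result instead of mapping to an empty dictionary, so which form is preferable depends on how the dictionary is consumed.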

Upvotes: 1
