Georg Heiler

Reputation: 17676

Pandas nested dict to dataframe

I have a simple nested list like this:

allFrame = [
    {'statValues': {'kpi2': 2, 'kpi1': 1}, 'modelName': 'first'},
    {'statValues': {'kpi2': 4, 'kpi1': 2}, 'modelName': 'second'},
    {'statValues': {'kpi2': 3, 'kpi1': 3}, 'modelName': 'third'},
]

Neither pd.DataFrame(allFrame) nor pd.DataFrame.from_dict(allFrame) really works: both leave the nested statValues dicts unexpanded in a single column.

How can I get the kpi1 and kpi2 keys as column names instead?

I found "Python dict to DataFrame Pandas", which does something similar. However, I believe this operation should be simpler.

Upvotes: 2

Views: 2340

Answers (2)

Bonus Luo

Reputation: 1

pd.DataFrame(allFrame, columns=['modelName', 'statValues'])

Upvotes: 0

OmerBA

Reputation: 842

It looks like you need to flatten those dicts first, so apply a flattening function to each dict in the list:

def flatten_dict(d, prefix='__'):
    """Flatten nested dicts, joining parent and child keys with `prefix`."""
    def items():
        # A closure (generator) that recursively extracts dict-like values
        for key, value in d.items():
            if isinstance(value, dict):
                # Pass the prefix down so nested levels use the same separator
                for sub_key, sub_value in flatten_dict(value, prefix).items():
                    # Key name should imply nested origin of the dict,
                    # so we use a default prefix of __ instead of _ or .
                    yield key + prefix + sub_key, sub_value
            else:
                yield key, value
    return dict(items())

Then build the DataFrame from the list of flattened records, so that each dict in the list becomes one row of the DataFrame.

So:

import pandas as pd

flat_rows = [flatten_dict(row) for row in allFrame]
df = pd.DataFrame.from_records(flat_rows)
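
For a single level of nesting like the one in the question, two shorter routes may also be worth trying. The sketch below assumes pandas 1.0 or newer (where pd.json_normalize is available at the top level); the names df_norm, rows and df_plain are made up for illustration. The first option keeps the parent key in the column names, the second unpacks statValues by hand so the columns are plain kpi1 and kpi2.

```python
import pandas as pd

# Sample data copied from the question
allFrame = [
    {'statValues': {'kpi2': 2, 'kpi1': 1}, 'modelName': 'first'},
    {'statValues': {'kpi2': 4, 'kpi1': 2}, 'modelName': 'second'},
    {'statValues': {'kpi2': 3, 'kpi1': 3}, 'modelName': 'third'},
]

# Option 1 (assumes pandas >= 1.0): json_normalize flattens nested dicts
# and joins parent and child keys with ".", e.g. "statValues.kpi1".
df_norm = pd.json_normalize(allFrame)

# Option 2: unpack the one nested level by hand, so the inner keys
# (kpi1, kpi2) become the column names without any parent prefix.
rows = [{'modelName': rec['modelName'], **rec['statValues']} for rec in allFrame]
df_plain = pd.DataFrame(rows)

print(df_norm)
print(df_plain)
```

Option 2 only works when the nesting depth and the name of the nested key are known in advance; for arbitrary depth, a recursive flattener like the one above is the more general choice.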

Upvotes: 2
