Mukesh Nayal
Mukesh Nayal

Reputation: 21

Can we make a table from the nested dictionary with the use of Pandas?

Question: How to print student name by maths mark descending?

[
    {
        "name": "student2",
        "age": 30,
        "is_male": False,
        "marks": {"science": 56, "maths": 32, "english": 67},
        "total_marks": 155,
    },
    {
        "name": "student4",
        "age": 28,
        "is male": False,
        "marks": {"science": 78, "maths": 55, "english": 98},
        "total_marks": 230,
    },
    {
        "name": "student3",
        "age": 25,
        "is_male": True,
        "marks": {"science": 89, "maths": 56, "english": 99},
        "total_marks": 244,
    },
    {
        "name": "student1",
        "age": 23,
        "is_male": True,
        "marks": {"science": 95, "maths": 89, "english": 95},
        "total_marks": 279,
    },
]

Upvotes: 0

Views: 66

Answers (2)

Scott Boston
Scott Boston

Reputation: 153460

Try:

df = pd.DataFrame(
    [
        {
            "name": "student2",
            "age": 30,
            "is_male": False,
            "marks": {"science": 56, "maths": 32, "english": 67},
            "total_marks": 155,
        },
        {
            "name": "student4",
            "age": 28,
            "is male": False,
            "marks": {"science": 78, "maths": 55, "english": 98},
            "total_marks": 230,
        },
        {
            "name": "student3",
            "age": 25,
            "is_male": True,
            "marks": {"science": 89, "maths": 56, "english": 99},
            "total_marks": 244,
        },
        {
            "name": "student1",
            "age": 23,
            "is_male": True,
            "marks": {"science": 95, "maths": 89, "english": 95},
            "total_marks": 279,
        },
    ]
)

indx = (
    df.explode("marks")
    .query('marks == "maths"')
    .sort_values("total_marks", ascending=False)
    .index
)

#or more simply:

indx = df['marks'].str['maths'].sort_values(ascending=False).index

df.reindex(indx)

Output:

       name  age is_male                                        marks  total_marks is male
3  student1   23    True  {'science': 95, 'maths': 89, 'english': 95}          279     NaN
2  student3   25    True  {'science': 89, 'maths': 56, 'english': 99}          244     NaN
1  student4   28     NaN  {'science': 78, 'maths': 55, 'english': 98}          230   False
0  student2   30   False  {'science': 56, 'maths': 32, 'english': 67}          155     NaN

Upvotes: 3

JonSG
JonSG

Reputation: 13067

I think the easiest way to do this is to use json_normalize() though there is also nothing wrong with @scott-boston's answer.

import pandas
data = [
    {'name': 'student2', 'age': 30, 'is_male': False, 'marks': {'science': 56, 'maths': 32, 'english': 67}, 'total_marks': 155},
    {'name': 'student4', 'age': 28, 'is male': False, 'marks': {'science': 78, 'maths': 55, 'english': 98}, 'total_marks': 230},
    {'name': 'student3', 'age': 25, 'is_male': True, 'marks': {'science': 89, 'maths': 56, 'english': 99}, 'total_marks': 244},
    {'name': 'student1', 'age': 23, 'is_male': True, 'marks': {'science': 95, 'maths': 89, 'english': 95}, 'total_marks': 279}
]
print(pandas.json_normalize(data).sort_values("marks.maths", ascending=False))

Breaking out the nested dict so you can sort on it.

This gives you:

       name  age is_male  total_marks  marks.science  marks.maths  marks.english is male
3  student1   23    True          279             95           89             95     NaN
2  student3   25    True          244             89           56             99     NaN
1  student4   28     NaN          230             78           55             98   False
0  student2   30   False          155             56           32             67     NaN

Upvotes: 1

Related Questions