MoondaKamina

Reputation: 35

Count of occurrences while including other information

In Python, I have a list of dictionaries. I want to count occurrences of each name, but I also want to merge the other data with it.

Here is the input

task_list = [
    {
        "name": "user1",
        "email": "[email protected]",
        "task": "12121"
    },
    {
        "name": "user2",
        "email": "[email protected]",
        "task": "13131"
    },
    {
        "name": "user1",
        "email": "[email protected]",
        "task": "14141"
    }
]

Expected output

[
    {
        "name": "user1",
        "email": "[email protected]",
        "task": ["12121", "14141"],
        "count": 2
    },
    {
        "name": "user2",
        "email": "[email protected]",
        "task": ["13131"],
        "count": 1
    }
]

I can currently get only the count for each user, and I am not sure how to merge the other information with it.

Here is my code

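counts = {}
for task in task_list:
    # First time this name is seen: start its count at 1.
    if task["name"] not in counts:
        counts[task["name"]] = 1
    else:
        counts[task["name"]] += 1

# Turn the name -> count mapping into a list of dictionaries.
result = [{"name": name, "count": count} for name, count in counts.items()]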

Here is the output I am getting so far

[
    {
        "name": "user1",
        "count": 2
    },
    {
        "name": "user2",
        "count": 1
    }
]

All I can think of is looping over the list 5-6 times to generate the expected output, but I think that is a bad solution. Is there a package or some other approach that can accomplish this?

Upvotes: 0

Views: 63

Answers (2)

James_Carno

Reputation: 11

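This code has done it for me. Note that it modifies task_list in place:

i = 0
for d1 in task_list:
    i += 1
    # Normalise "task" to a list so every record has the same shape.
    if not isinstance(d1["task"], list):
        d1["task"] = [d1["task"]]
    # Fold every later record with the same name into d1.
    # task_list[i:] is a copy, so removing from task_list here is safe.
    for d2 in task_list[i:]:
        if d1["name"] == d2["name"]:
            d1["task"].append(d2["task"])
            task_list.remove(d2)
    # The count is the number of tasks collected for this name.
    d1["count"] = len(d1["task"])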
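
If you would rather not modify task_list, the same merge can be done in a single pass with a plain dictionary keyed by name. This is only a minimal sketch: merge_tasks is a hypothetical helper name, the email addresses are placeholders, and it assumes each name always comes with the same email (the first one seen is kept).

```python
def merge_tasks(tasks):
    """Group task records by name, collecting task ids and a count.

    Hypothetical helper for illustration. Assumes every record has
    "name", "email" and "task" keys, and that a given name always
    carries the same email (the first one seen is kept).
    """
    merged = {}  # name -> merged record; dicts preserve insertion order
    for record in tasks:
        # Create the merged record the first time a name is seen.
        entry = merged.setdefault(
            record["name"],
            {
                "name": record["name"],
                "email": record["email"],
                "task": [],
                "count": 0,
            },
        )
        entry["task"].append(record["task"])
        entry["count"] += 1
    return list(merged.values())


# Placeholder data in the same shape as the question's input.
task_list = [
    {"name": "user1", "email": "user1@example.com", "task": "12121"},
    {"name": "user2", "email": "user2@example.com", "task": "13131"},
    {"name": "user1", "email": "user1@example.com", "task": "14141"},
]

result = merge_tasks(task_list)
print(result)
```

The input list is left untouched, and the result keeps users in the order they first appear.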

Upvotes: 1

Kenan

Reputation: 14094

Pandas can do the job

import pandas as pd

df = pd.DataFrame(task_list)

# Group rows by name and email, collect each group's tasks into a list,
# then add a count column holding the length of that list.
result = (
    df.groupby(['name', 'email'])
      .agg({'task': list})
      .assign(count=lambda g: g['task'].map(len))
      .reset_index()
      .to_dict(orient='records')
)
print(result)

[{'name': 'user1', 'email': '[email protected]', 'task': ['12121', '14141'], 'count': 2}, {'name': 'user2', 'email': '[email protected]', 'task': ['13131'], 'count': 1}]

Upvotes: 0
