Enzo Matheus

Reputation: 53

Add column during loop with specific value

Imagine that I have the following dict:

configs = {
    'CONFIG1': [
        {
            "server": "SERVER_1",
            "description": "Testing server 1.",
        },
        {
            "server": "SERVER_2",
            "description": "Testing server 2.",
        }
    ],
    'CONFIG2': [
        {
            "server": "SERVER_3",
            "description": "Testing server 3.",
        },
        {
            "server": "SERVER_4",
            "description": "Testing server 4.",
        }
    ],
    'CONFIG3': []
}

I want to organize this config into a dataframe that looks like this:

server    description        config_name
SERVER_1  Testing server 1.  CONFIG1
SERVER_2  Testing server 2.  CONFIG1
SERVER_3  Testing server 3.  CONFIG2
SERVER_4  Testing server 4.  CONFIG2

I also want to prevent empty configuration keys such as CONFIG3 from being added to the dataframe.

I've tried doing it like this:

import pandas as pd

df = pd.DataFrame()

for config in configs:
    if configs[config]:
        df = df.append(configs[config], ignore_index=True)
        df['config_name'] = config

print(df)

But the configuration name is not right: every row ends up with the name of the last non-empty configuration. The output is:

server    description        config_name
SERVER_1  Testing server 1.  CONFIG2
SERVER_2  Testing server 2.  CONFIG2
SERVER_3  Testing server 3.  CONFIG2
SERVER_4  Testing server 4.  CONFIG2

Upvotes: 0

Views: 45

Answers (4)

Vishnudev Krishnadas

Reputation: 10960

A one-liner using a list comprehension. Empty lists such as CONFIG3 contribute no rows, so they are skipped automatically:

df = pd.DataFrame([{**d, 'config_name': k} for k,v in configs.items() for d in v])

Output

     server        description config_name
0  SERVER_1  Testing server 1.     CONFIG1
1  SERVER_2  Testing server 2.     CONFIG1
2  SERVER_3  Testing server 3.     CONFIG2
3  SERVER_4  Testing server 4.     CONFIG2

Upvotes: 0

Barmar

Reputation: 780724

df['config_name'] = config assigns the value to every row in the df, not just the rows you just added, so each iteration overwrites the names set by the previous ones.

Add it as an entry in the dictionaries before appending to the df.

for name, dicts in configs.items():
    if dicts:
        for d in dicts:
            d['config_name'] = name
        df = df.append(dicts, ignore_index=True)
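
Note that DataFrame.append was removed in pandas 2.0, so the loop above only runs on older versions. Here is a minimal sketch of the same idea using pd.concat, assuming the configs dict from the question. The names frames and chunk are only illustrative. It tags each chunk before combining, and it leaves the original dicts unmodified:

```python
import pandas as pd

configs = {
    'CONFIG1': [
        {"server": "SERVER_1", "description": "Testing server 1."},
        {"server": "SERVER_2", "description": "Testing server 2."},
    ],
    'CONFIG2': [
        {"server": "SERVER_3", "description": "Testing server 3."},
        {"server": "SERVER_4", "description": "Testing server 4."},
    ],
    'CONFIG3': [],
}

frames = []  # one small DataFrame per non-empty config (illustrative name)
for name, dicts in configs.items():
    if dicts:  # skip empty configs such as CONFIG3
        chunk = pd.DataFrame(dicts)
        chunk['config_name'] = name  # only touches the rows of this chunk
        frames.append(chunk)  # list.append, not DataFrame.append

# Combine everything once at the end and renumber the rows
df = pd.concat(frames, ignore_index=True)
print(df)
```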

Upvotes: 0

BENY

Reputation: 323226

Let us try explode; the config name ends up in the index:

out = pd.Series(configs).explode().dropna().apply(pd.Series)

Output:

           server        description
CONFIG1  SERVER_1  Testing server 1.
CONFIG1  SERVER_2  Testing server 2.
CONFIG2  SERVER_3  Testing server 3.
CONFIG2  SERVER_4  Testing server 4.
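
To get the config name as a regular config_name column, as the question asks, one option is to name the index and reset it. This is a sketch assuming the configs dict from the question; the final column reordering is only there to match the requested layout:

```python
import pandas as pd

configs = {
    'CONFIG1': [
        {"server": "SERVER_1", "description": "Testing server 1."},
        {"server": "SERVER_2", "description": "Testing server 2."},
    ],
    'CONFIG2': [
        {"server": "SERVER_3", "description": "Testing server 3."},
        {"server": "SERVER_4", "description": "Testing server 4."},
    ],
    'CONFIG3': [],
}

out = (
    pd.Series(configs)
    .explode()                   # one row per dict; an empty list becomes NaN
    .dropna()                    # drops the NaN row left by CONFIG3
    .apply(pd.Series)            # expand each dict into columns
    .rename_axis('config_name')  # name the index so reset_index creates that column
    .reset_index()
)

# Put the columns in the order requested in the question
out = out[['server', 'description', 'config_name']]
print(out)
```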

Upvotes: 0

Quang Hoang

Reputation: 150735

Do not repeatedly append to a dataframe: each append copies all the existing data. Building the pieces and calling concat once is almost always a better choice:

pd.concat([pd.DataFrame(d).assign(config_name=k)
           for k, d in configs.items()])

Output:

     server        description config_name
0  SERVER_1  Testing server 1.     CONFIG1
1  SERVER_2  Testing server 2.     CONFIG1
0  SERVER_3  Testing server 3.     CONFIG2
1  SERVER_4  Testing server 4.     CONFIG2
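
The index in the output above repeats (0, 1, 0, 1) because each piece keeps its own index. Here is a sketch of a variant, assuming the same configs dict from the question. It passes ignore_index=True to renumber the rows, and it filters out empty lists so that CONFIG3 never produces a frame at all:

```python
import pandas as pd

configs = {
    'CONFIG1': [
        {"server": "SERVER_1", "description": "Testing server 1."},
        {"server": "SERVER_2", "description": "Testing server 2."},
    ],
    'CONFIG2': [
        {"server": "SERVER_3", "description": "Testing server 3."},
        {"server": "SERVER_4", "description": "Testing server 4."},
    ],
    'CONFIG3': [],
}

df = pd.concat(
    # "if d" skips configs whose list is empty
    [pd.DataFrame(d).assign(config_name=k) for k, d in configs.items() if d],
    ignore_index=True,  # renumber rows 0..n-1 instead of repeating 0, 1
)
print(df)
```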

Upvotes: 2
