Reputation: 53
Imagine that I have the following dict:
configs = {
    'CONFIG1': [
        {
            "server": "SERVER_1",
            "description": "Testing server 1.",
        },
        {
            "server": "SERVER_2",
            "description": "Testing server 2.",
        }
    ],
    'CONFIG2': [
        {
            "server": "SERVER_3",
            "description": "Testing server 3.",
        },
        {
            "server": "SERVER_4",
            "description": "Testing server 4.",
        }
    ],
    'CONFIG3': [
    ]
}
I want to organize this config into a dataframe so that it is like this:
server | description | config_name |
---|---|---|
SERVER_1 | Testing server 1. | CONFIG1 |
SERVER_2 | Testing server 2. | CONFIG1 |
SERVER_3 | Testing server 3. | CONFIG2 |
SERVER_4 | Testing server 4. | CONFIG2 |
I also want to prevent empty configuration keys such as CONFIG3 from being added to the dataframe.
I've tried doing it like this:
import pandas as pd
df = pd.DataFrame()
for config in configs:
    if configs[config]:
        df = df.append(configs[config], ignore_index=True)
        df['config_name'] = config
print(df)
But the configuration name is not right. The output is:
server | description | config_name |
---|---|---|
SERVER_1 | Testing server 1. | CONFIG2 |
SERVER_2 | Testing server 2. | CONFIG2 |
SERVER_3 | Testing server 3. | CONFIG2 |
SERVER_4 | Testing server 4. | CONFIG2 |
Upvotes: 0
Views: 45
Reputation: 10960
A one-liner using a list comprehension. Empty lists such as CONFIG3 contribute no rows, so they are skipped automatically:
df = pd.DataFrame([{**d, 'config_name': k} for k,v in configs.items() for d in v])
Output
     server        description config_name
0  SERVER_1  Testing server 1.     CONFIG1
1  SERVER_2  Testing server 2.     CONFIG1
2  SERVER_3  Testing server 3.     CONFIG2
3  SERVER_4  Testing server 4.     CONFIG2
Upvotes: 0
Reputation: 780724
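The line df['config_name'] = config assigns the value to every row in the df, not just the rows you just added, so the last non-empty config name overwrites all the earlier ones.
Add the name as an entry in the dictionaries before appending them to the df instead: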
for name, dicts in configs.items():
    if dicts:
        for d in dicts:
            d['config_name'] = name
        df = df.append(dicts, ignore_index=True)
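Note that DataFrame.append was removed in pandas 2.0, so the loop above only runs on older versions. The following is a minimal sketch, not the answerer's code: it first reproduces the overwrite on a tiny frame, then applies the same "tag each record first" idea by collecting copies of the dicts in a plain list (the names `rows`, `tagged`, and `overwritten` are made up for illustration) and building the dataframe once. The configs here are a trimmed version of the ones in the question.

```python
import pandas as pd

# Reproduce the problem: assigning a scalar to a column sets every row.
df = pd.DataFrame({"server": ["SERVER_1", "SERVER_3"]})
df["config_name"] = "CONFIG1"
df["config_name"] = "CONFIG2"  # overwrites the value in both rows
overwritten = df["config_name"].tolist()

# Trimmed-down version of the configs from the question.
configs = {
    "CONFIG1": [{"server": "SERVER_1", "description": "Testing server 1."}],
    "CONFIG2": [{"server": "SERVER_3", "description": "Testing server 3."}],
    "CONFIG3": [],
}

# Tag each record with its config name, then build the frame once.
rows = []
for name, dicts in configs.items():
    for d in dicts:  # an empty list simply contributes nothing
        # {**d, ...} makes a copy, so the dicts in configs are not modified
        rows.append({**d, "config_name": name})

tagged = pd.DataFrame(rows)
print(tagged)
```

Copying each dict is a design choice: the loop in the answer writes config_name into the original dictionaries, which may or may not be what you want.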
Upvotes: 0
Reputation: 323226
You can also try explode. Note that the config name ends up in the index rather than in a config_name column:
out = pd.Series(configs).explode().dropna().apply(pd.Series)
Output:
           server        description
CONFIG1  SERVER_1  Testing server 1.
CONFIG1  SERVER_2  Testing server 2.
CONFIG2  SERVER_3  Testing server 3.
CONFIG2  SERVER_4  Testing server 4.
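The explode result keeps the config name in the index, while the question asks for a config_name column. One possible way to get there, sketched here as an assumption rather than as part of the original answer, is to name the index with rename_axis, move it into a column with reset_index, and then reorder the columns:

```python
import pandas as pd

configs = {
    "CONFIG1": [
        {"server": "SERVER_1", "description": "Testing server 1."},
        {"server": "SERVER_2", "description": "Testing server 2."},
    ],
    "CONFIG2": [
        {"server": "SERVER_3", "description": "Testing server 3."},
        {"server": "SERVER_4", "description": "Testing server 4."},
    ],
    "CONFIG3": [],
}

out = (
    pd.Series(configs)
    .explode()                   # one row per dict; the empty list becomes NaN
    .dropna()                    # drops the CONFIG3 placeholder row
    .apply(pd.Series)            # expand each dict into columns
    .rename_axis("config_name")  # give the index a name ...
    .reset_index()               # ... and turn it into a regular column
)

# reset_index puts config_name first; reorder to match the desired layout.
out = out[["server", "description", "config_name"]]
print(out)
```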
Upvotes: 0
Reputation: 150735
Do not repeatedly append to a dataframe. pd.concat is almost always a better choice:
pd.concat([pd.DataFrame(d).assign(config_name=k)
           for k, d in configs.items()
          ])
Output:
     server        description config_name
0  SERVER_1  Testing server 1.     CONFIG1
1  SERVER_2  Testing server 2.     CONFIG1
0  SERVER_3  Testing server 3.     CONFIG2
1  SERVER_4  Testing server 4.     CONFIG2
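The index in that output restarts at 0 for each config, and the empty CONFIG3 list is still turned into an (empty) dataframe before being concatenated. A small variant, sketched here with two additions that are not in the answer (the `if d` filter and `ignore_index=True`), skips empty lists explicitly and produces a continuous 0..3 index:

```python
import pandas as pd

configs = {
    "CONFIG1": [
        {"server": "SERVER_1", "description": "Testing server 1."},
        {"server": "SERVER_2", "description": "Testing server 2."},
    ],
    "CONFIG2": [
        {"server": "SERVER_3", "description": "Testing server 3."},
        {"server": "SERVER_4", "description": "Testing server 4."},
    ],
    "CONFIG3": [],
}

df = pd.concat(
    [
        pd.DataFrame(d).assign(config_name=k)  # tag only this config's rows
        for k, d in configs.items()
        if d                                   # skip empty lists such as CONFIG3
    ],
    ignore_index=True,                         # renumber rows 0..n-1
)
print(df)
```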
Upvotes: 2