Reputation: 1891
Let's assume I have this code below which uses the flatten function to flatten a JSON object and convert it into a Pandas data frame.
from flatten_json import flatten
import pandas as pd
data = [{
"_id" : 1,
"IdList" : [
6422
],
"levels" : [
{
"active" : "true",
"level" : 3,
"actions" : [
{
"isActive" : "true"
}]
}]
}
,
{
"_id" : 2,
"IdList" : [
6442
],
"levels" : [
{
"active" : "true",
"level" : 1,
"actions" : [
{
"isActive" : "true"
}]
}]
}
]
dic_flattened = [flatten(i) for i in data]
result = pd.DataFrame(dic_flattened)
result.columns = result.columns.str.replace("_0_", ".")
print(result)
Current Output:
_id IdList_0 levels.active levels.level levels.actions.isActive
0 1 6422 true 3 true
1 2 6442 true 1 true
The function uses a _
as a separator, however, I do not need the underscore along with the integers in the column name so I have added a replace which gets rid of them. My only issue is when I try to replace the _0
from the IdList_0
column I get the following error:
AttributeError: 'Index' object has no attribute 'replace'
Is there any way for me to get rid of that _0
from the column name so the desired output can be like the following:
Desired Output:
_id IdList levels.active levels.level levels.actions.isActive
0 1 6422 true 3 true
1 2 6442 true 1 true
Upvotes: 2
Views: 6140
Reputation: 13518
Try:
result.columns = ["_id", "IdList", "levels.active", "levels.level", "levels.actions.isActive"]
Or a "dynamic" version as per your second question:
result.columns = [col.replace("_0", "") for col in result.columns]
Upvotes: 4