nofar mishraki

Reputation: 618

pandas change the order of columns

In my Flask project I receive JSON through a REST API and need to convert its data to a pandas DataFrame. The JSON looks like this:

{
    "entity_data":[
                  {"id": 1, "store": "a", "marker": "a"}
    ]
}

I get the JSON and extract the data:

params = request.json
entity_data = params.pop('entity_data')

and then I convert the data into a pandas dataframe:

entity_ids = pd.DataFrame(entity_data)

the result looks like this:

   id marker store
0   1      a     a

This is not the original order of the columns. How can I keep the columns in the same order as the keys in the dictionary?

Upvotes: 3

Views: 2294

Answers (3)

Adithya

Reputation: 1728

Assuming you control the sender of the JSON, you can include the column order in the JSON itself, like this:

{
    "order": ["id", "store", "marker"],
    "entity_data": {"id": [1, 2], "store": ["a", "b"],
                    "marker": ["a", "b"]}
}

Then create the DataFrame with the columns specified, as suggested by Chiheb.K:

import pandas as pd
params = request.json
entity_data = params.pop('entity_data')
order = params.pop('order')
# pass the extracted entity_data and the order sent alongside it
entity_df = pd.DataFrame(entity_data, columns=order)
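
For anyone who wants to try this outside Flask, here is a minimal self-contained sketch of the same idea. The payload string stands in for the request body, and the "order" key is only the convention assumed in this answer (something you agree on with the sender), not anything pandas or Flask define.

```python
import json
import pandas as pd

# Hypothetical payload: the "order" key is an assumed convention
# agreed with the sender, not a pandas or Flask feature.
payload = json.loads("""{
    "order": ["id", "store", "marker"],
    "entity_data": {"id": [1, 2], "store": ["a", "b"], "marker": ["a", "b"]}
}""")

order = payload.pop("order")
entity_data = payload.pop("entity_data")

# columns= fixes the column order explicitly, so the result does not
# depend on how the dictionary keys happened to be ordered.
entity_df = pd.DataFrame(entity_data, columns=order)
print(list(entity_df.columns))
```

Because the order travels with the data, the receiver does not have to hard-code the column names.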

If you cannot specify the order explicitly in the JSON, see this answer on passing object_pairs_hook to the JSON decoder to get an OrderedDict, and then create the DataFrame from that.

Upvotes: 1

jpp

Reputation: 164773

Use OrderedDict for an ordered dictionary

You should not assume dictionaries are ordered. While dictionaries are insertion ordered in Python 3.7+, you should not assume that libraries maintain this order when reading JSON into a dictionary or when converting the dictionary to a Pandas dataframe.

The most reliable solution is to use collections.OrderedDict from the standard library:

import json
import pandas as pd
from collections import OrderedDict

params = """{
    "entity_data":[
                  {"id": 1, "store": "a", "marker": "a"}
    ]
}"""

# params stands in for the JSON text received in the request
data = json.loads(params, object_pairs_hook=OrderedDict)
entity_data = data.pop('entity_data')

df = pd.DataFrame(entity_data)

print(df)

#    id store marker
# 0   1     a      a

Upvotes: 2

Chiheb.K

Reputation: 156

Just pass the column names through the columns parameter.

entity_ids = pd.DataFrame(entity_data, columns=["id","store","marker"])
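
If the DataFrame has already been built, the same reordering can be done afterwards by selecting the columns in the order you want. This is a small sketch using the sample record from the question; the `wanted` list is just an illustrative name.

```python
import pandas as pd

# Sample record from the question
entity_data = [{"id": 1, "store": "a", "marker": "a"}]

# Illustrative name: the column order we want to end up with
wanted = ["id", "store", "marker"]

entity_ids = pd.DataFrame(entity_data)

# Selecting with a list of names returns the columns in that order.
# entity_ids.reindex(columns=wanted) is an alternative that fills
# missing columns with NaN instead of raising a KeyError.
entity_ids = entity_ids[wanted]
print(list(entity_ids.columns))
```

Either way the order is stated explicitly in your code, so it no longer depends on how the dictionary keys were ordered.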

Upvotes: 2
