user7542670
user7542670

Reputation:

Convert a dataframe to JSON via a dictionary using "to_dict()" and "json.dump()"

I'm trying to convert a dataframe to a particular JSON format. I've attempted doing this using the methods "to_dict()" and "json.dump()" from the pandas and json modules, respectively, but I can't get the JSON format I'm after. To illustrate:

df = pd.DataFrame({
    "Location": ["1ST"] * 3 + ["2ND"] * 3,
    "Date": ["2019-01", "2019-02", "2019-03"] * 2,
    "Category": ["A", "B", "C"] * 2,
    "Number": [1, 2, 3, 4, 5, 6]
})

def dataframe_to_dictionary(df, orientation):
    dictionary = df.to_dict(orient=orientation)
    return dictionary

dict_records = dataframe_to_dictionary(df, "records")

with open("./json_records.json", "w") as json_records:
    json.dump(dict_records, json_records, indent=2)

dict_index = dataframe_to_dictionary(df, "index")

with open("./json_index.json", "w") as json_index:
    json.dump(dict_index, json_index, indent=2)

When I convert "dict_records" to JSON, I get an array of the form:

[
  {
    "Location": "1ST",
    "Date": "2019-01",
    "Category": "A",
    "Number": 1
  },
  {
    "Location": "1ST",
    "Date": "2019-02",
    "Category": "B",
    "Number": 2
  },
...
]

And, when I convert "dict_index" to JSON, I get an object of the form:

{
  "0": {
    "Location": "1ST",
    "Date": "2019-01",
    "Category": "A",
    "Number": 1
  },
  "1": {
    "Location": "1ST",
    "Date": "2019-02",
    "Category": "B",
    "Number": 2
  }
...
}

But, I'm trying to get a format that looks like the following (where key = location and values = [{}]) like below. Thanks in advance for your help.

{
    1ST: [
        {
            "Date": "2019-01",
            "Category": "A",
            "Number" 1
        },
        {
            "Date": "2019-02",
            "Category": "B",
            "Number" 2
        },
        {
            "Date": "2019-03",
            "Category": "C",
            "Number" 3
        }
    ],
    2ND: [
        {},
        {},
        {}
    ]
}

Upvotes: 1

Views: 1886

Answers (1)

funnydman
funnydman

Reputation: 11336

This can be achieved via groupby:

gb = df.groupby('Location')

{k: v.drop('Location', axis=1).to_dict(orient='records') for k, v in gb}

Upvotes: 1

Related Questions