kevin

Reputation: 309

Python Pandas to CSV using custom column as index

I used the code below to create a Pandas data frame and output to JSON:

dataDict['grant_id'] = list(grant_ids)
dataDict['patent_title'] = list(patent_title)
dfjson = dfjson.transpose()
dfjson = pd.DataFrame.from_dict(dataDict, orient='index')
output_json = dfjson.to_json('output_json.json')

{"0":
{"grant_id":"US10357643",
"patent_title":"System for enhanced sealing of coupled medical flui,
 .... }}

However, I do not want 0 to be included in the JSON. I want it to look like this:

{"US10357643":{
"patent_title":"System for enhanced sealing of coupled medical flui,
 .... }}

I want to use grant_id as the index, rather than 0, 1, 2, 3, ... Is there a way to fix this?

Upvotes: 2

Views: 255

Answers (1)

jezrael

Reputation: 863166

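I believe you need DataFrame.set_index with a double transpose. set_index works on columns, so transpose first to turn grant_id into a column, set it as the index, then transpose back: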

import pandas as pd

dfjson = pd.DataFrame({"0": {"grant_id": "US10357643",
                             "patent_title": "System1"},
                       "1": {"grant_id": "US10357644",
                             "patent_title": "System2"}})
print(dfjson)
                       0           1
grant_id      US10357643  US10357644
patent_title     System1     System2

output_json = dfjson.T.set_index('grant_id').T.to_json()
print(output_json)
{"US10357643":{"patent_title":"System1"},"US10357644":{"patent_title":"System2"}}

dfjson.T.set_index('grant_id').T.to_json('output_json.json')

So it seems you need to set the index before transposing:

dataDict['grant_id'] = list(grant_ids)
dataDict['patent_title'] = list(patent_title)

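# Use the default orient='columns' so that grant_id is a column set_index can find;
# with orient='index' the keys become row labels and set_index raises a KeyError
dfjson = pd.DataFrame.from_dict(dataDict)
dfjson = dfjson.set_index('grant_id').transpose()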

dfjson.to_json('output_json.json')
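
As an alternative, here is a minimal sketch that skips the transpose entirely by asking to_json for orient='index', which writes each index label as a top-level key. The sample lists are hypothetical stand-ins for the asker's grant_ids and patent_title, and it assumes the grant_id values are unique, which orient='index' requires:

```python
import json

import pandas as pd

# Hypothetical sample data standing in for the asker's grant_ids / patent_title.
grant_ids = ["US10357643", "US10357644"]
patent_title = ["System1", "System2"]

data_dict = {"grant_id": grant_ids, "patent_title": patent_title}

# Dict keys become columns here, so grant_id is a real column that
# set_index can promote to the row index.
df = pd.DataFrame(data_dict).set_index("grant_id")

# orient="index" emits {index_label: {column: value}}, so no transpose is needed.
output_json = df.to_json(orient="index")
print(output_json)

# Parse the string back to check its structure.
result = json.loads(output_json)
```

Passing a file path as the first argument, for example df.to_json('output_json.json', orient='index'), should write the same structure to disk instead of returning a string.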

Upvotes: 1
