sivaguru

Reputation: 25

How to convert a pandas DataFrame column from a string array to an actual array

I am reading a CSV file and sending its rows out as a JSON stream.
One of the columns holds arrays, but pandas reads it into the DataFrame as a plain string.
How can I convert the "attributes_changed" column from a string array to a normal array?

Sample Code:

import json
import pandas as pd

sample = {"event_id": "dd9726c4-9c22-47b3-9a1b-fd6ecf494076", "event_type": "UPDATED", "attributes_changed": "[\'metadataStatus\', \'elements\']"}
df = pd.DataFrame(data=sample, index=[0])

df_with_seq_id = [(json.dumps(rec), idx) for (idx, rec) in df.to_dict(orient='index').items()]

Current Result

[('{"event_id": "dd9726c4-9c22-47b3-9a1b-fd6ecf494076", "event_type": "UPDATED", "attributes_changed": "['metadataStatus', 'elements']"}', 0)]

Expected Result (without the double quotes around the array)

[('{"event_id": "dd9726c4-9c22-47b3-9a1b-fd6ecf494076", "event_type": "UPDATED", "attributes_changed": ['metadataStatus', 'elements']}', 0)]

Upvotes: 0

Views: 1551

Answers (1)

Antoine Dubuis

Reputation: 5324

You can apply eval to the attributes_changed column so that each string value is interpreted as a Python list:

import json
import pandas as pd

sample = {
    "event_id": "dd9726c4-9c22-47b3-9a1b-fd6ecf494076",
    "event_type": "UPDATED",
    "attributes_changed": "['metadataStatus', 'elements']",
}
df = pd.DataFrame(data=sample, index=[0])
df["attributes_changed"] = df["attributes_changed"].apply(eval)
df_with_seq_id = [
    (json.dumps(rec), idx) for (idx, rec) in df.to_dict(orient="index").items()
]

This produces the following list:

[('{"event_id": "dd9726c4-9c22-47b3-9a1b-fd6ecf494076", "event_type": "UPDATED", "attributes_changed": ["metadataStatus", "elements"]}',
  0)]
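
Since the data comes from a CSV file, a variant worth considering is to parse the column while reading, using ast.literal_eval instead of eval. literal_eval only accepts Python literals (lists, strings, numbers and so on), so it will not execute arbitrary code that happens to be in the file. The sketch below is only an illustration: the in-memory csv_text is a made-up stand-in for the real file, and it assumes every cell in the column contains a valid list literal (an empty or malformed cell would raise an exception and need separate handling).

```python
import ast
import io
import json

import pandas as pd

# Hypothetical in-memory CSV standing in for the real file (assumption).
csv_text = (
    "event_id,event_type,attributes_changed\n"
    'dd9726c4-9c22-47b3-9a1b-fd6ecf494076,UPDATED,"[\'metadataStatus\', \'elements\']"\n'
)

# converters applies ast.literal_eval to each cell of the column during parsing,
# so the column already holds real Python lists once the DataFrame is built.
df = pd.read_csv(
    io.StringIO(csv_text),
    converters={"attributes_changed": ast.literal_eval},
)

# Same serialization step as in the question.
df_with_seq_id = [
    (json.dumps(rec), idx) for (idx, rec) in df.to_dict(orient="index").items()
]
```

For a DataFrame that already exists, the same function can be used in place of eval: df["attributes_changed"].apply(ast.literal_eval).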

Upvotes: 2
