Reputation: 25
I am reading a CSV file and sending its rows out as a JSON stream.
One of the columns holds an array, but pandas loads it into the DataFrame as a string.
How can I convert the "attributes_changed" column from a stringified array to a normal array?
Sample Code:
import json
import pandas as pd
sample = {"event_id": "dd9726c4-9c22-47b3-9a1b-fd6ecf494076", "event_type": "UPDATED", "attributes_changed": "[\'metadataStatus\', \'elements\']"}
df = pd.DataFrame(data=sample, index=[0])
df_with_seq_id = [(json.dumps(rec), idx) for (idx, rec) in df.to_dict(orient='index').items()]
Current Result
[('{"event_id": "dd9726c4-9c22-47b3-9a1b-fd6ecf494076", "event_type": "UPDATED", "attributes_changed": "['metadataStatus', 'elements']"}', 0)]
Expected Result (double quotes around the array removed)
[('{"event_id": "dd9726c4-9c22-47b3-9a1b-fd6ecf494076", "event_type": "UPDATED", "attributes_changed": ['metadataStatus', 'elements']}', 0)]
Upvotes: 0
Views: 1551
Reputation: 5324
You can simply apply eval on the attributes_changed column to interpret the string values as arrays:
import json
import pandas as pd
sample = {
    "event_id": "dd9726c4-9c22-47b3-9a1b-fd6ecf494076",
    "event_type": "UPDATED",
    "attributes_changed": "['metadataStatus', 'elements']",
}
df = pd.DataFrame(data=sample, index=[0])
df["attributes_changed"] = df["attributes_changed"].apply(eval)
df_with_seq_id = [
    (json.dumps(rec), idx) for (idx, rec) in df.to_dict(orient="index").items()
]
This produces the following result:
[('{"event_id": "dd9726c4-9c22-47b3-9a1b-fd6ecf494076", "event_type": "UPDATED", "attributes_changed": ["metadataStatus", "elements"]}',
0)]
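Because eval executes whatever expression the string contains, a more cautious variant may be ast.literal_eval, which only accepts Python literals. The following is a minimal sketch, assuming every cell in the column holds a valid Python list literal; csv_text is a hypothetical stand-in for the real CSV file, and the parsing can be done either at read time (the converters argument of read_csv) or afterwards with apply:

```python
import ast
import io
import json

import pandas as pd

# Hypothetical CSV content standing in for the real file (assumption).
csv_text = (
    "event_id,event_type,attributes_changed\n"
    "dd9726c4-9c22-47b3-9a1b-fd6ecf494076,UPDATED,"
    "\"['metadataStatus', 'elements']\"\n"
)

# Option 1: parse the column while reading, via the converters argument.
df = pd.read_csv(
    io.StringIO(csv_text),
    converters={"attributes_changed": ast.literal_eval},
)

# Option 2: parse a column that was already loaded as strings.
df_raw = pd.read_csv(io.StringIO(csv_text))
df_raw["attributes_changed"] = df_raw["attributes_changed"].apply(ast.literal_eval)

# Same serialization step as in the question.
df_with_seq_id = [
    (json.dumps(rec), idx) for (idx, rec) in df.to_dict(orient="index").items()
]
```

Note that literal_eval raises an exception on empty or malformed cells, so real data may need a small wrapper function that handles those cases.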
Upvotes: 2