Reputation: 885
I have many strings like this:
"[{'id': 10749, 'name': 'Romance'}, {'id': 35, 'name': 'Comedy'}]"
But since I'm working with a dataframe, I need to convert them into JSON (or at least that's what the format looks like) so I can access and flatten the data. Any idea how this can be achieved?
EDIT: I realised that it's not JSON, but I still don't know how to convert this to a dictionary or so in order to manipulate it.
Upvotes: 0
Views: 59
Reputation: 783
As this could be a repetitive task, it's probably a good idea to make a function out of it.
import json  # Import json module to work with json data
import ast

data = "[{'id': 10749, 'name': 'Romance'}, {'id': 35, 'name': 'Comedy'}]"

def clean_data_for_json_loads(input_data):
    """Prepare data from untrusted sources for json formatting.
    Output JSON object as string."""
    evaluated_data = ast.literal_eval(input_data)
    json_object_as_string = json.dumps(evaluated_data)
    return json_object_as_string

evaluated_data = clean_data_for_json_loads(data)
# Load json data from a string, the (s) in loads stands for string. This helps to remember the difference to json.load
json_data = json.loads(evaluated_data)
print(json_data)
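One thing worth noting: for data shaped like the example, the result of ast.literal_eval is already a list of dicts, so the json.dumps / json.loads round trip should not change anything. A minimal sketch to check that, assuming the same sample string (the names direct, round_tripped and same are only illustrative):

```python
import ast
import json

data = "[{'id': 10749, 'name': 'Romance'}, {'id': 35, 'name': 'Comedy'}]"

# literal_eval already returns ordinary Python objects (a list of dicts here).
direct = ast.literal_eval(data)

# Serialising to JSON and parsing it back gives an equal structure for this input.
round_tripped = json.loads(json.dumps(direct))

same = direct == round_tripped
print(same)
```

The round trip only matters if you really need a JSON string somewhere. It is not always an identity: tuples come back as lists and non-string dict keys come back as strings.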
Upvotes: -1
Reputation: 1228
It looks like the data is almost JSON, but JSON requires double quotes around the dictionary keys and string values (single quotes can still delimit the Python string that holds the whole object). You can fix this by running:
data_string = "[{'id': 10749, 'name': 'Romance'}, {'id': 35, 'name': 'Comedy'}]"
json_string = data_string.replace("'", '"')
You now have a JSON string!
If you need to convert the string to Python structures, you can do the following:
import json
data = json.loads(json_string)
print(data[0]['id']) # 10749
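A caveat with the quote replacement: it only works while no value contains an apostrophe. A short sketch of the failure, using a made-up record (the title is just an example, not from the question's data):

```python
import ast
import json

# repr() of a dict whose value contains an apostrophe uses double quotes
# around that value: [{'id': 1, 'name': "Schindler's List"}]
raw = repr([{'id': 1, 'name': "Schindler's List"}])

# Blindly swapping quotes turns the apostrophe into a stray double quote.
replaced = raw.replace("'", '"')

try:
    json.loads(replaced)
    replace_worked = True
except json.JSONDecodeError:
    replace_worked = False

# ast.literal_eval parses the original string as a Python literal instead.
parsed = ast.literal_eval(raw)

print(replace_worked)
print(parsed[0]['name'])
```

The same approach also leaves Python-only literals such as None, True and False untouched, and those are not valid JSON either.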
Upvotes: 0
Reputation: 19300
You can use ast.literal_eval:
import ast
x = ast.literal_eval("[{'id': 10749, 'name': 'Romance'}, {'id': 35, 'name': 'Comedy'}]")
x[0]["name"] # evaluates to 'Romance'
From the documentation:
Safely evaluate an expression node or a string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.
This can be used for safely evaluating strings containing Python values from untrusted sources without the need to parse the values oneself. It is not capable of evaluating arbitrarily complex expressions, for example involving operators or indexing.
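Since the question mentions many such strings, here is a rough sketch of applying this to a whole collection and flattening the result. The helper parse_literal and the sample rows are hypothetical, and it assumes every parsed item has 'id' and 'name' keys; strings that are not plain literals are skipped rather than raising:

```python
import ast

def parse_literal(text, default=None):
    """Hypothetical helper: parse a Python-literal string, or return `default`."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Raised for input that is not a plain literal or is not valid syntax.
        return default

rows = [
    "[{'id': 10749, 'name': 'Romance'}, {'id': 35, 'name': 'Comedy'}]",
    "[]",
    "len('abc')",  # a function call, not a literal, so it is rejected
]

# Flatten into (row index, id, name) tuples, one per genre entry.
flat = []
for row_index, text in enumerate(rows):
    for item in parse_literal(text, default=[]):
        flat.append((row_index, item["id"], item["name"]))

print(flat)
```

With a dataframe, a function like this could presumably be passed to pandas' Series.apply on the column holding the strings, though that is not shown here.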
Upvotes: 2