Reputation: 797
It's been a while since my last Python project, so I'm a bit rusty. Feel free to offer any advice or criticism where due. I have a few questions regarding eval and JSON.
For this project I'm limited to the Python 2.6 standard library. I'm attempting to parse the database contents of a proprietary Linux-based application used for LDAP authentication. The specific command used to query the database isn't strictly important, but I'm using the following method to capture its output:
process = subprocess.Popen([cmd], shell=True, stdout=subprocess.PIPE)
stdout = process.communicate()[0]
Output:
[{'header_obj_idx': 32,
'header_obj_state': 2,
'header_obj_type': 48,
'index': 2,
'name': '',
'obj_id': '8b14c165094d4cac81725227ce389277',
'ldap_data': '{"search_filter":"(sAMAccountName={username})","roles":["admin:CN=SuperUsers,DC=example,DC=com","read_only:CN=Users,DC=example,DC=com"],"search_base":"OU=Users,DC=example,DC=com","server_url":["ldap://ad.example.com","ldaps://ad.example.com:3001"],"user_to_dn_rule":"{username}@example.com","bind_dn":"CN=Bind,OU=Users,DC=example,DC=com","timeout":1500,"bind_pw":"xxxxxxxx","cache_expire":86400,"ca_cert_file":null}'},
{'header_obj_idx': 31,
'header_obj_state': 2,
'header_obj_type': 48,
'index': 1,
'name': '',
'obj_id': 'b0efc7a3d38a4f70abec4f73f69124de',
'ldap_data': '{"search_filter":"(sAMAccountName={username})","roles":["admin:CN=SuperUsers,DC=example,DC=com"],"search_base":"OU=Users,DC=example,DC=com","server_url":["ldap://169.254.0.1"],"user_to_dn_rule":null,"bind_dn":"CN=Bind,OU=Users,DC=example,DC=com","timeout":1500,"bind_pw":"xxxxxxxx","cache_expire":86400,"ca_cert_file":null}'}]
I came across a post indicating that shell=True may have certain security implications, and was wondering whether there is a better solution?
Utilizing both ast.literal_eval and json.loads, I've been able to successfully parse individual key/value pairs, but I have a feeling my multi-level conversion is not necessary and that there may be a better way?
def ldap_data(stdout):
    # evaluate object and return usable 'ldap_data' as dictionary
    _data = ast.literal_eval(stdout)[0]['ldap_data']
    return json.loads(_data)

ldap_data(stdout)['roles']
Lastly, when I started this project, it never occurred to me that the user may have multiple LDAP configs depending on the individual deployment needs, so I never really considered how to go about parsing each dictionary instance. Given the number of roadblocks I've run into with this solution, I was hoping someone could help engineer a solution that takes advantage of the index found in the output above.
I apologize for asking so much. I'm sure I'm just overthinking this a bit, and I look forward to learning what I might do to improve. Thanks in advance for the help!
Upvotes: 2
Views: 445
Reputation: 55469
Generally, when running a single command with subprocess you don't need shell=True if you make the command name and each option a separate string, i.e. ['cmd', 'arg1', 'arg2']. You do need shell=True to execute commands that are internal to the shell, or utilise other shell features, as described in the docs, but that's not an issue here.
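As a minimal sketch of the list form: the real query command isn't shown in the question, so the command below is a stand-in (the current Python interpreter printing a fixed string), and the run_command name is made up for illustration. shlex.split is one way to turn an existing command string into the list form.

```python
import shlex
import subprocess
import sys


def run_command(args):
    """Run a command given as a list of strings and return its stdout."""
    # No shell=True: the program is executed directly, so shell
    # metacharacters in the arguments are not interpreted.
    process = subprocess.Popen(args, stdout=subprocess.PIPE)
    stdout = process.communicate()[0]
    return stdout


# Stand-in command: the current Python interpreter printing a fixed string.
args = [sys.executable, '-c', 'print("hello")']
output = run_command(args)

# shlex.split turns a command string into the list form.
parts = shlex.split('cmd --flag "an argument with spaces"')
```

If the command only ever comes from your own code, building the list by hand is simplest; shlex.split is mainly useful when you start from a single string.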
As for parsing that data, you don't need ast.literal_eval for this, but you do need to fix the quotes to make that data valid JSON. That can be done by escaping existing double quotes, and then converting single quotes to double quotes. Note that this simple replacement assumes none of the values contain an apostrophe or a backslash, which holds for the sample shown. Once the repaired data is parsed into a Python list using json.loads, you need to call json.loads again on each 'ldap_data' string to extract the LDAP dictionaries.
import json
src = '''\
[{'header_obj_idx': 32,
'header_obj_state': 2,
'header_obj_type': 48,
'index': 2,
'name': '',
'obj_id': '8b14c165094d4cac81725227ce389277',
'ldap_data': '{"search_filter":"(sAMAccountName={username})","roles":["admin:CN=SuperUsers,DC=example,DC=com","read_only:CN=Users,DC=example,DC=com"],"search_base":"OU=Users,DC=example,DC=com","server_url":["ldap://ad.example.com","ldaps://ad.example.com:3001"],"user_to_dn_rule":"{username}@example.com","bind_dn":"CN=Bind,OU=Users,DC=example,DC=com","timeout":1500,"bind_pw":"xxxxxxxx","cache_expire":86400,"ca_cert_file":null}'},
{'header_obj_idx': 31,
'header_obj_state': 2,
'header_obj_type': 48,
'index': 1,
'name': '',
'obj_id': 'b0efc7a3d38a4f70abec4f73f69124de',
'ldap_data': '{"search_filter":"(sAMAccountName={username})","roles":["admin:CN=SuperUsers,DC=example,DC=com"],"search_base":"OU=Users,DC=example,DC=com","server_url":["ldap://169.254.0.1"],"user_to_dn_rule":null,"bind_dn":"CN=Bind,OU=Users,DC=example,DC=com","timeout":1500,"bind_pw":"xxxxxxxx","cache_expire":86400,"ca_cert_file":null}'}]
'''
# Escape existing double quotes, and then convert single quotes to double quotes
data = json.loads(src.replace('"', '\\"').replace("'", '"'))
for d in data:
    ldap = json.loads(d['ldap_data'])
    print json.dumps(ldap, indent=4, sort_keys=True), '\n'
output
{
    "bind_dn": "CN=Bind,OU=Users,DC=example,DC=com",
    "bind_pw": "xxxxxxxx",
    "ca_cert_file": null,
    "cache_expire": 86400,
    "roles": [
        "admin:CN=SuperUsers,DC=example,DC=com",
        "read_only:CN=Users,DC=example,DC=com"
    ],
    "search_base": "OU=Users,DC=example,DC=com",
    "search_filter": "(sAMAccountName={username})",
    "server_url": [
        "ldap://ad.example.com",
        "ldaps://ad.example.com:3001"
    ],
    "timeout": 1500,
    "user_to_dn_rule": "{username}@example.com"
}

{
    "bind_dn": "CN=Bind,OU=Users,DC=example,DC=com",
    "bind_pw": "xxxxxxxx",
    "ca_cert_file": null,
    "cache_expire": 86400,
    "roles": [
        "admin:CN=SuperUsers,DC=example,DC=com"
    ],
    "search_base": "OU=Users,DC=example,DC=com",
    "search_filter": "(sAMAccountName={username})",
    "server_url": [
        "ldap://169.254.0.1"
    ],
    "timeout": 1500,
    "user_to_dn_rule": null
}
Tested on Python 2.6.6
Note that the keys (and value strings) in the ldap dictionaries are Unicode strings, and that those values represented as null in the JSON dump are actually None.
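Since a deployment may have several LDAP configs, one option is to collect the parsed dictionaries keyed by the 'index' field of the outer records. This is only a sketch: it reuses the quote-fixing approach above, so it carries the same assumptions about the data, and the parse_ldap_configs name and the shortened sample string are made up for illustration.

```python
import json


def parse_ldap_configs(src):
    """Return a dict mapping each record's 'index' to its parsed ldap_data."""
    # Same repair as above: escape double quotes, then swap single for double.
    records = json.loads(src.replace('"', '\\"').replace("'", '"'))
    configs = {}
    for record in records:
        # Second json.loads call: 'ldap_data' is itself a JSON string.
        configs[record['index']] = json.loads(record['ldap_data'])
    return configs


# Shortened, hypothetical sample in the same shape as the real output.
sample = '''\
[{'index': 2,
  'name': '',
  'ldap_data': '{"roles":["admin:CN=SuperUsers,DC=example,DC=com"],"timeout":1500,"ca_cert_file":null}'},
 {'index': 1,
  'name': '',
  'ldap_data': '{"roles":[],"timeout":1500,"ca_cert_file":null}'}]
'''

configs = parse_ldap_configs(sample)
for index in sorted(configs):
    print('%s %s' % (index, configs[index]['roles']))
```

With the configs keyed this way, configs[1]['roles'] gives the roles for the config whose index is 1, regardless of the order in which the records were returned.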
Upvotes: 2