eval vs json conversion and parsing of a dictionary like string

Question

It's been a while since my last Python project, so I'm a bit rusty; feel free to offer any advice or criticism where due. I have a few questions regarding eval and JSON.

For this project I'm limited to the Python 2.6 standard library. I'm attempting to parse the database contents of a proprietary Linux-based application used for LDAP authentication. The specific command used to query the database isn't strictly important, but I'm using the following method to capture its output:

process = subprocess.Popen([cmd], shell=True, stdout=subprocess.PIPE)
stdout = process.communicate()[0]

Output:

[{'header_obj_idx': 32,
  'header_obj_state': 2,
  'header_obj_type': 48,
  'index': 2,
  'name': '',
  'obj_id': '8b14c165094d4cac81725227ce389277',
  'ldap_data': '{"search_filter":"(sAMAccountName={username})","roles":["admin:CN=SuperUsers,DC=example,DC=com","read_only:CN=Users,DC=example,DC=com"],"search_base":"OU=Users,DC=example,DC=com","server_url":["ldap://ad.example.com","ldaps://ad.example.com:3001"],"user_to_dn_rule":"{username}@example.com","bind_dn":"CN=Bind,OU=Users,DC=example,DC=com","timeout":1500,"bind_pw":"xxxxxxxx","cache_expire":86400,"ca_cert_file":null}'},
 {'header_obj_idx': 31,
  'header_obj_state': 2,
  'header_obj_type': 48,
  'index': 1,
  'name': '',
  'obj_id': 'b0efc7a3d38a4f70abec4f73f69124de',
  'ldap_data': '{"search_filter":"(sAMAccountName={username})","roles":["admin:CN=SuperUsers,DC=example,DC=com"],"search_base":"OU=Users,DC=example,DC=com","server_url":["ldap://169.254.0.1"],"user_to_dn_rule":null,"bind_dn":"CN=Bind,OU=Users,DC=example,DC=com","timeout":1500,"bind_pw":"xxxxxxxx","cache_expire":86400,"ca_cert_file":null}'}]

** I came across a post indicating that shell=True may have security implications. Is there a better solution?

Using both ast.literal_eval and json.loads I've been able to parse individual key/value pairs, but I have a feeling my two-step conversion is not necessary and that there may be a better way.

def ldap_data(stdout):
    # evaluate object and return usable 'ldap_data' as dictionary
    _data = ast.literal_eval(stdout)[0]['ldap_data']
    return json.loads(_data)

ldap_data(stdout)['roles']

Lastly, when I started this project, it never occurred to me that the user may have multiple LDAP configs depending on the individual deployment needs, so I never really considered how to go about parsing each dictionary instance. Given the number of roadblocks I've run into with this solution, I was hoping someone could help engineer a solution that takes advantage of the index found in the output above.

I apologize for asking so much; I'm sure I'm just overthinking this a bit, and I look forward to learning what I might do to improve. Thanks in advance for the help!

PM 2Ring · Accepted Answer

Generally, when running a single command with subprocess you don't need shell=True if you make the command name and each option a separate string, i.e.

['cmd', 'arg1', 'arg2']

You do need shell=True to execute commands that are internal to the shell, or utilise other shell features, as described in the docs, but that's not an issue here.
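As a minimal sketch of the list form, assuming a stand-in command because the real query command isn't shown in the question: the snippet below runs the current Python interpreter as the child process, passing the program and each argument as separate list items and omitting shell=True. The contents of `cmd` are hypothetical; substitute the actual program and its options.

```python
import subprocess
import sys

# Hypothetical stand-in for the real database query command:
# the program name and each argument are separate list items.
cmd = [sys.executable, '-c', 'print("hello from child")']

# No shell=True: the program is executed directly, so the arguments
# are never interpreted by a shell.
process = subprocess.Popen(cmd, stdout=subprocess.PIPE)
stdout = process.communicate()[0]

# On Python 3 communicate() returns bytes; on Python 2 it is already a str.
if not isinstance(stdout, str):
    stdout = stdout.decode('utf-8')

print(stdout.strip())
```

Because no shell is involved, characters such as spaces, quotes or semicolons inside an argument are passed to the program as-is instead of being interpreted.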

As for parsing that data, you don't need ast.literal_eval for this, but you do need to fix the quotes to make that data valid JSON. That can be done by escaping existing double quotes, and then converting single quotes to double quotes. And once the repaired data is parsed into a Python list using json.loads you need to call json.loads again to extract the LDAP dictionaries.

import json

src = '''\
[{'header_obj_idx': 32,
  'header_obj_state': 2,
  'header_obj_type': 48,
  'index': 2,
  'name': '',
  'obj_id': '8b14c165094d4cac81725227ce389277',
  'ldap_data': '{"search_filter":"(sAMAccountName={username})","roles":["admin:CN=SuperUsers,DC=example,DC=com","read_only:CN=Users,DC=example,DC=com"],"search_base":"OU=Users,DC=example,DC=com","server_url":["ldap://ad.example.com","ldaps://ad.example.com:3001"],"user_to_dn_rule":"{username}@example.com","bind_dn":"CN=Bind,OU=Users,DC=example,DC=com","timeout":1500,"bind_pw":"xxxxxxxx","cache_expire":86400,"ca_cert_file":null}'},
 {'header_obj_idx': 31,
  'header_obj_state': 2,
  'header_obj_type': 48,
  'index': 1,
  'name': '',
  'obj_id': 'b0efc7a3d38a4f70abec4f73f69124de',
  'ldap_data': '{"search_filter":"(sAMAccountName={username})","roles":["admin:CN=SuperUsers,DC=example,DC=com"],"search_base":"OU=Users,DC=example,DC=com","server_url":["ldap://169.254.0.1"],"user_to_dn_rule":null,"bind_dn":"CN=Bind,OU=Users,DC=example,DC=com","timeout":1500,"bind_pw":"xxxxxxxx","cache_expire":86400,"ca_cert_file":null}'}]
'''

# Escape existing double quotes, and then convert single quotes to double quotes
data = json.loads(src.replace('"', '\\"').replace("'", '"'))
for d in data:
    ldap = json.loads(d['ldap_data'])
    print json.dumps(ldap, indent=4, sort_keys=True), '\n'

output

{
    "bind_dn": "CN=Bind,OU=Users,DC=example,DC=com", 
    "bind_pw": "xxxxxxxx", 
    "ca_cert_file": null, 
    "cache_expire": 86400, 
    "roles": [
        "admin:CN=SuperUsers,DC=example,DC=com", 
        "read_only:CN=Users,DC=example,DC=com"
    ], 
    "search_base": "OU=Users,DC=example,DC=com", 
    "search_filter": "(sAMAccountName={username})", 
    "server_url": [
        "ldap://ad.example.com", 
        "ldaps://ad.example.com:3001"
    ], 
    "timeout": 1500, 
    "user_to_dn_rule": "{username}@example.com"
} 

{
    "bind_dn": "CN=Bind,OU=Users,DC=example,DC=com", 
    "bind_pw": "xxxxxxxx", 
    "ca_cert_file": null, 
    "cache_expire": 86400, 
    "roles": [
        "admin:CN=SuperUsers,DC=example,DC=com"
    ], 
    "search_base": "OU=Users,DC=example,DC=com", 
    "search_filter": "(sAMAccountName={username})", 
    "server_url": [
        "ldap://169.254.0.1"
    ], 
    "timeout": 1500, 
    "user_to_dn_rule": null
}

Tested on Python 2.6.6

Note that the keys (and value strings) in the ldap dictionaries are Unicode strings, and that those values represented as null in the JSON dump are actually None.
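For the multiple-configuration part of the question, here is a hedged sketch of one way to collect every LDAP config keyed by its 'index' field. It is an alternative to the quote swap above, not the answer's method: it reads the outer list with ast.literal_eval (as in the question), which also copes with apostrophes inside values, and uses json.loads only for the embedded JSON. The function name `ldap_configs` and the shortened `sample` string are illustrative and not part of the original output.

```python
import ast
import json

# Shortened, illustrative sample in the same shape as the command output.
sample = '''\
[{'index': 2,
  'name': '',
  'ldap_data': '{"roles":["admin:CN=SuperUsers,DC=example,DC=com"],"timeout":1500,"ca_cert_file":null}'},
 {'index': 1,
  'name': '',
  'ldap_data': '{"roles":["read_only:CN=Users,DC=example,DC=com"],"timeout":1500,"ca_cert_file":null}'}]
'''

def ldap_configs(stdout):
    """Return a dict mapping each entry's 'index' to its parsed ldap_data."""
    configs = {}
    # The outer structure is a Python literal (single-quoted strings),
    # so literal_eval can read it without any quote rewriting.
    for entry in ast.literal_eval(stdout):
        # The 'ldap_data' value is itself a JSON document stored as a string.
        configs[entry['index']] = json.loads(entry['ldap_data'])
    return configs

configs = ldap_configs(sample)
for index in sorted(configs):
    print(index)
    print(configs[index]['roles'])
```

With this, configs[1]['roles'] gives the roles for the entry whose index is 1, and a JSON null such as ca_cert_file comes back as None.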
