Reputation: 3931
I want to clean a dictionary that comes from a JSON object, removing all the \n and | characters so that I can use csv.DictWriter to write it out as a line in a flat file for a copy into an AWS database. I've never used recursion on a dict object before, and I'm struggling to work out how to move through all the nesting levels until I reach a single string, and then iterate through a list of items that I want to replace. With my current code I get an IndexError saying that my string index is out of range. Here is my function:
    def purge_items(in_iter, items):
        if isinstance(in_iter, dict):
            for k, v in in_iter:
                if isinstance(v, dict):
                    purge_items(k[v], items)
        elif isinstance(in_iter, list):
            for item in items:
                for elem in in_iter:
                    try:
                        elem.replace(item[0], item[1])
                    except AttributeError:
                        continue
        else:
            try:
                for item in items:
                    in_iter.replace(item[0], item[1])
            except AttributeError:
                return
The function expects a dictionary of arbitrary nesting depth (once it works with a dictionary, I want to generalize it to accept any mutable container) and a list of the items to replace. Each item has the form ('\n', ' '), where the first entry is the text to replace and the second entry is what to replace it with.
An example of the data I'm working with is below, with newlines included:
    {'issuetype': {'avatarId': 22101,
                   'description': 'A problem found in '
                                  'production which impairs '
                                  'or prevents the '
                                  'functions of the '
                                  'product.',
                   'iconUrl': 'https://instructure.atlassian.net/secure/viewavatar?size=xsmall&avatarId=22101&avatarType=issuetype',
                   'id': '1',
                   'name': 'Bug',
                   'self': 'https://instructure.atlassian.net/rest/api/2/issuetype/1',
                   'subtask': False}}
Upvotes: 0
Views: 84
Reputation: 710
OK, there are plenty of modules and functions for handling and manipulating text in general, to mention only a few:
ast.literal_eval()
textwrap.dedent()
but in your case something as simple as this:
test = """
{'issuetype': {'avatarId': 22101,
'description': 'A problem found in '
'production which impairs '
'or prevents the '
'functions of the '
'product.',
'iconUrl': 'https://instructure.atlassian.net/secure/viewavatar?size=xsmall&avatarId=22101&avatarType=issuetype',
'id': '1',
'name': 'Bug',
'self': 'https://instructure.atlassian.net/rest/api/2/issuetype/1',
'subtask': False}
}
"""
print("".join([obj.strip().replace('|', '') for obj in test.split("\n")]))
Output:
{'issuetype': {'avatarId': 22101,'description': 'A problem found in ''production which impairs ''or prevents the ''functions of the ''product.','iconUrl': 'https://instructure.atlassian.net/secure/viewavatar?size=xsmall&avatarId=22101&avatarType=issuetype','id': '1','name': 'Bug','self': 'https://instructure.atlassian.net/rest/api/2/issuetype/1','subtask': False}}
should suffice, shouldn't it?
Oops, not quite: the doubled quotes ('') left behind where the pieces of the description were joined need to be removed too. Corrected version:
test_1 = "".join([obj.strip().replace('|', '')
                  for obj in test.split("\n")])
test_2 = test_1.replace("''", "")
print(test_2)
Output:
{'issuetype': {'avatarId': 22101,'description': 'A problem found in production which impairs or prevents the functions of the product.','iconUrl': 'https://instructure.atlassian.net/secure/viewavatar?size=xsmall&avatarId=22101&avatarType=issuetype','id': '1','name': 'Bug','self': 'https://instructure.atlassian.net/rest/api/2/issuetype/1','subtask': False}}
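If you would rather clean the dictionary itself than its printed form, a recursive version is also possible. The following is only a minimal sketch, not a fix of the original function line by line: it assumes the data contains only dicts, lists, strings and other scalar values, it returns a new structure instead of modifying the input (str.replace returns a new string, so its result has to be stored somewhere), and the sample record and the replacement pairs are made up for illustration.

```python
def purge_items(obj, replacements):
    """Return a copy of obj with each (old, new) pair applied to every string value."""
    if isinstance(obj, dict):
        # Recurse into the values; keys are left untouched in this sketch.
        return {key: purge_items(value, replacements) for key, value in obj.items()}
    if isinstance(obj, list):
        return [purge_items(elem, replacements) for elem in obj]
    if isinstance(obj, str):
        for old, new in replacements:
            obj = obj.replace(old, new)  # keep the result: strings are immutable
        return obj
    # Numbers, booleans, None and anything else are returned unchanged.
    return obj


# Hypothetical sample data, loosely modelled on the record in the question.
record = {
    "issuetype": {
        "description": "A problem found in\nproduction which|impairs",
        "id": "1",
        "subtask": False,
        "labels": ["a|b", 3],
    }
}

# A list of (old, new) pairs, in the form described in the question.
cleaned = purge_items(record, [("\n", " "), ("|", " ")])
print(cleaned)
```

Note that the replacements have to be passed as a list of pairs such as [('\n', ' ')]. Passing a single pair like ('\n', ' ') makes the loop iterate over the two one-character strings, which is one way to end up with a "string index out of range" error.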
Upvotes: 1