MRose429

Reputation: 21

JSON to python dictionary: Printing values

Noob here. I have a large number of JSON files, each containing a series of blog posts in a different language. The key-value pairs are metadata about the posts, e.g. {'author':'John Smith', 'translator':'Jane Doe'}. What I want to do is convert each one to a Python dictionary, then extract the values so that I have a list of all the authors and translators across all the posts.

for lang in languages:
   f = 'posts-' + lang + '.json'
   file = codecs.open(f, 'rt', 'utf-8')
   line = string.strip(file.next())
   postAuthor[lang] = []
   postTranslator[lang]=[]

   while (line):
      data = json.loads(line)
      print data['author']
      print data['translator']

When I try this method, I keep getting a KeyError for 'translator' and I'm not sure why. I've never worked with the json module before, so I tried a more complex method to see what would happen:

  postAuthor[lang].append(data['author'])
  for translator in data.keys():
      if not data.has_key('translator'):
           postTranslator[lang] = ""
      postTranslator[lang] = data['translator']

It keeps raising an error saying that strings do not have an append method. This seems like a simple task, and I'm not sure what I'm doing wrong.

Upvotes: 2

Views: 7346

Answers (1)

jessejlt

Reputation: 71

See if this works for you:

import json

# you have lots of "posts", so let's assume
# you've stored them in some list. We'll use
# the example text you gave as one of the entries
# in said list

posts = ["{'author':'John Smith', 'translator':'Jane Doe'}"]

# strictly speaking, the single quotes in your example aren't
# valid json, so you'll want to switch the single quotes
# out for double quotes. you can verify this with something
# like http://jsonlint.com/
# luckily, you can easily swap out all the quotes programmatically
# (note: a blanket swap breaks any value that itself contains an
# apostrophe, e.g. O'Brien)

# so let's loop through the posts, and store the authors and translators
# in two lists
authors = []
translators = []

for post in posts:
    double_quotes_post = post.replace("'", '"')
    json_data = json.loads(double_quotes_post)

    # .get() returns None when the key is missing instead of raising KeyError
    author = json_data.get('author')
    translator = json_data.get('translator')

    if author:
        authors.append(author)
    if translator:
        translators.append(translator)

# and there you have it, a list of authors and translators
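
If you want to keep the per-language grouping from your question, here is a minimal sketch of the same idea. It assumes each line of a posts file holds one valid JSON object (which is what your loop suggests), and the function name collect_people and the sample data are made up for illustration. Not every post has a translator, which is the likely cause of your KeyError, so it uses .get() as above.

```python
import json


def collect_people(lines_by_lang):
    """Collect authors and translators per language.

    lines_by_lang maps a language code to an iterable of lines,
    each assumed to hold one JSON object (one post).
    """
    authors = {}
    translators = {}
    for lang, lines in lines_by_lang.items():
        authors[lang] = []
        translators[lang] = []
        for line in lines:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            data = json.loads(line)
            # .get() returns None for a missing key instead of raising KeyError
            author = data.get('author')
            translator = data.get('translator')
            if author:
                authors[lang].append(author)
            if translator:
                translators[lang].append(translator)
    return authors, translators


# hypothetical sample data standing in for the contents of posts-en.json
sample = {
    'en': [
        '{"author": "John Smith", "translator": "Jane Doe"}',
        '{"author": "John Smith"}',
    ],
}
authors, translators = collect_people(sample)
print(authors)
print(translators)
```

With real files you could pass an open file object as the iterable, e.g. open('posts-en.json', encoding='utf-8'), since iterating over a file yields its lines. As for the append error: in your second snippet, postTranslator[lang] is assigned a string, which replaces the list you created earlier, so a later .append() on it would most likely fail for that reason.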

Upvotes: 2
