user977828

Reputation: 7679

Create a JSON object in python

I am trying to create a JSON object and append it to a list, but without success. I get this error message:

Traceback (most recent call last):
  File "/projects/circos/test.py", line 32, in <module>
    read_relationship('data/chr03_small_n10.blast')
  File "/projects/circos/test.py", line 20, in read_relationship
    tmp = ("[source: {id: '{}',start: {},end: {}},target: {id: '{}',start: {}, end: {}}],").format(parts[0],parts[2],parts[3],parts[1],parts[4],parts[5])
KeyError: 'id'

It is raised by the following code:

def read_relationship(filename):
    data = []
    with open(filename) as f:
        f.next()
        for line in f:
            try:
                parts = line.rstrip().split('\t')
                query_name = parts[0]
                subject_name = parts[1]
                query_start = parts[2]
                query_end = parts[3]
                subject_start = parts[4]
                subject_end = parts[5]


                # I need: [source: {id: 'Locus_1', start: 1, end: 1054}, target: {id: 'tig00007234', start: 140511, end: 137383}],
                tmp = ("[source: {id: '{}',start: {},end: {}},target: {id: '{}',start: {}, end: {}}],").format(parts[0],parts[2],parts[3],parts[1],parts[4],parts[5])
                data.append(tmp)

            except ValueError:
                pass

    with open('data/data.txt', 'w') as outfile:
        json.dump(data, outfile)


read_relationship('data/chr03_small_n10.blast')

What did I miss?

Upvotes: 0

Views: 13359

Answers (2)

Elis Byberi

Reputation: 1452

You are using the json.dump() function incorrectly.

It takes the object to serialize first, then a file object:

json.dump(object, fileobject)
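
As a minimal sketch of that argument order on Python 3: the io.StringIO buffer is an assumption that stands in for a real open file, and the record values are taken from the example line in the question.

```python
import io
import json

# One nested record; the values come from the example in the question.
records = [{
    'source': {'id': 'Locus_1', 'start': 1, 'end': 1054},
    'target': {'id': 'tig00007234', 'start': 140511, 'end': 137383},
}]

# io.StringIO stands in for an open text file (assumption for this sketch).
buffer = io.StringIO()
json.dump(records, buffer)  # object first, file object second

# Parsing the written text gives back the same structure.
round_tripped = json.loads(buffer.getvalue())
```

With a real file, the buffer would be replaced by the object returned from open(..., 'w').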

Use a dict for the key-value mapping:

import json

def read_relationship(filename):
    data = []
    with open(filename) as f:
        next(f)  # skip the header line (works on Python 2.6+ and 3)
        for line in f:
            try:
                parts = line.rstrip().split('\t')
                query_name = parts[0]
                subject_name = parts[1]
                query_start = parts[2]
                query_end = parts[3]
                subject_start = parts[4]
                subject_end = parts[5]

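                # use a dict here; the columns are mapped as in the question:
                # source = query (name, start, end), target = subject
                item = {
                    'source': {
                        'id': query_name,
                        'start': query_start,
                        'end': query_end
                    },
                    'target': {
                        'id': subject_name,
                        'start': subject_start,
                        'end': subject_end
                    }
                }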
                data.append(item)

            except ValueError:
                pass

    with open('data/data.txt', 'w') as outfile:
        json.dump(data, outfile)


read_relationship('data/chr03_small_n10.blast')

Upvotes: 5

Martijn Pieters

Reputation: 1121366

You need to double the { and } characters that are not placeholders; {id:...} is seen as a named placeholder otherwise:

tmp = (
    "[source: {{id: '{}',start: {},end: {}}},"
    "target: {{id: '{}',start: {}, end: {}}}],").format(
        parts[0], parts[2], parts[3], parts[1], parts[4], parts[5])

The {{ and }} sequences end up as single { and } characters in the result.
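
A small sketch of both behaviours, using a shortened template rather than the full one from the question:

```python
# Single braces: str.format() treats {id: ...} as a placeholder named 'id'
# and looks for a keyword argument with that name.
try:
    "{id: '{}'}".format('Locus_1')
    error_name = None
except KeyError as exc:
    error_name = exc.args[0]  # the missing placeholder name

# Doubled braces are literal; only the inner {} is a placeholder.
fixed = "{{id: '{}'}}".format('Locus_1')
```

Here error_name holds the same key that appears in the traceback, and fixed contains single literal braces around the substituted value.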

Rather than passing each part separately, use numbered slots and unpack the list:

tmp = (
    "[source: {{id: '{0}',start: {2},end: {3}}},"
    "target: {{id: '{1}',start: {4}, end: {5}}}],").format(
        *parts)
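
To see the numbered slots at work, here is a sketch with a made-up input line (hypothetical, assembled from the values in the question's comment) in the column order the question's code expects:

```python
# Hypothetical tab-separated line: query, subject, q_start, q_end, s_start, s_end
line = 'Locus_1\ttig00007234\t1\t1054\t140511\t137383\n'
parts = line.rstrip().split('\t')

# {0}..{5} index into the unpacked list, so the slots can appear in any order.
tmp = (
    "[source: {{id: '{0}',start: {2},end: {3}}},"
    "target: {{id: '{1}',start: {4}, end: {5}}}],").format(
        *parts)
```

Any extra columns in parts are simply ignored, since format() does not require every positional argument to be used.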

You should consider using the csv module to read your TSV data. Also, if you meant for the above data to be part of the JSON document (as separate JSON arrays and objects rather than as an embedded string), then formatting it as a string won't work.

Build the structure from Python lists and dicts instead; you'll need to convert the numeric columns to integers first though:

import csv
import json

def read_relationship(filename):
    data = []
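    # 'rb' is the Python 2 idiom for the csv module;
    # on Python 3 use open(filename, newline='') instead
    with open(filename, 'rb') as f: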
        reader = csv.reader(f, delimiter='\t')
        next(reader, None)
        for row in reader:
            data.append([{
                'source': {
                    'id': row[0],
                    'start': int(row[2]),
                    'end': int(row[3]),
                },
                'target': {
                    'id': row[1],
                    'start': int(row[4]),
                    'end': int(row[5]),
                },
            }])

    with open('data/data.txt', 'w') as outfile:
        json.dump(data, outfile)
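
The difference between an embedded string and real JSON objects can be sketched as follows; the shortened records are illustrative assumptions, not the full structure from the question:

```python
import json

# A formatted string appended to the list is serialized as one JSON string.
as_string = ["{id: 'Locus_1', start: 1}"]

# A dict appended to the list is serialized as a JSON object with its own keys.
as_objects = [{'id': 'Locus_1', 'start': 1}]

string_json = json.dumps(as_string)
object_json = json.dumps(as_objects)

# After parsing, only the second form exposes 'id' and 'start' as fields.
parsed_string = json.loads(string_json)[0]
parsed_object = json.loads(object_json)[0]
```

A consumer reading the first form gets back plain text that it would have to parse again, while the second form yields a mapping with an integer start value.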

Upvotes: 2
