johowie

Reputation: 2495

Add an array of linked document _ids to CouchDB documents in Python

I want to add a links property to each CouchDB document based on data in a CSV file. The value of the links property should be an array of dicts, each containing the CouchDB _id of the linked document and the linkType.

When I run the script I get a KeyError on 'links' (see error info below). I am not sure how to create the links key and add the link data if the key doesn't exist, or append to the links array if it does.

An example of a document with links would look like this:

{
    _id: p_3,
    name: 'Smurfette',
    links: [
                {to_id: p_2, linkType: 'knows'},
                {to_id: o_56, linkType: 'follows'}
           ]
}

Python script for processing the CSV file:

#!/usr/bin/python
# coding: utf-8

# Version 1
# 
# csv fields: ID,fromType,fromID,toType,toID,LinkType,Directional


import csv, sys, couchdb


def csv2couchLinks(database, csvfile):

    # CouchDB Database Connection etc
    server = couchdb.Server()
    #assumes that couchdb runs on http://localhost:5984
    db = server[database]
    #assumes that db is already created

    # CSV file
    data = csv.reader(open(csvfile, "rb")) # Read in the CSV file rb=read/binary
    csv_links= csv.DictReader(open(csvfile, "rb"))


    def makeLink(from_id, to_id, linkType):
        # get doc from db
        doc = db[from_id]

        # construct link object
        link = {'to_id':to_id, 'linkType':linkType}

        # add link reference to array at key 'links'
        if doc['links'] in doc:
            doc['links'].append(link)
        else:
            doc['links'] = [link]

        # update the record in the database
        db[doc.id] = doc


    # read each row in csv file
    for row in csv_links:

        # get entityTypes as lowercase and entityIDs
        fromType = row['fromType'].lower()
        fromID   = row['fromID']
        toType   = row['toType'].lower()
        toID     = row['toID']

        linkType = row['LinkType']

        # concatenate 'entity type' and 'id' to make couch '_id'
        fromIDcouch = fromType[0]+'_'+fromID #eg 'p_2' <= person 2
        toIDcouch = toType[0]+'_'+toID

        makeLink(fromIDcouch, toIDcouch, linkType)
        makeLink(toIDcouch, fromIDcouch, linkType)


# Run csv2couchLinks() if this is not an imported module
if __name__ == '__main__':
    DATABASE = sys.argv[1]
    CSVFILE = sys.argv[2]
    csv2couchLinks(DATABASE,CSVFILE)   

Error info:

$ python LINKS_csv2couchdb_v1.py "qmhonour" "./tablesAsCsv/links.csv"
Traceback (most recent call last):
  File "LINKS_csv2couchdb_v1.py", line 65, in <module>
    csv2couchLinks(DATABASE,CSVFILE)   
  File "LINKS_csv2couchdb_v1.py", line 57, in csv2couchLinks
    makeLink(fromIDcouch, toIDcouch, linkType)
  File "LINKS_csv2couchdb_v1.py", line 33, in makeLink
    if doc['links'] in doc:
KeyError: 'links'

Upvotes: 0

Views: 655

Answers (2)

RocketDonkey

Reputation: 37279

Another option is to condense the if block to this:

doc.setdefault('links', []).append(link)

The dictionary's setdefault method checks whether the key 'links' exists in the dictionary. If it doesn't, it creates the key with an empty list as its value (the default) and returns that list; if it does, it returns the existing list. Either way, link is then appended to the returned list. The full function becomes:

def makeLink(from_id, to_id, linkType):
    # get doc from db
    doc = db[from_id]

    # construct link object
    link = {'to_id':to_id, 'linkType':linkType}

    # add link reference to array at key 'links'
    doc.setdefault('links', []).append(link)

    # update the record in the database
    db[doc.id] = doc
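
To try the pattern without a CouchDB server, here is a minimal sketch on a plain dict. It assumes the document object behaves like a regular dict (as far as I know, a couchdb-python Document is a dict subclass); the add_link helper and the sample data are made up for illustration and are not part of the original script.

```python
def add_link(doc, to_id, link_type):
    """Append a link to doc['links'], creating the list on first use."""
    # setdefault returns the existing list, or inserts and returns a new one
    doc.setdefault('links', []).append({'to_id': to_id, 'linkType': link_type})
    return doc


# Hypothetical document that starts without a 'links' key
doc = {'_id': 'p_3', 'name': 'Smurfette'}
add_link(doc, 'p_2', 'knows')     # first call creates the list
add_link(doc, 'o_56', 'follows')  # second call appends to it
print(doc['links'])
```

Both calls go through the same line of code, so there is no separate branch for the "key missing" case.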

Upvotes: 2

Nathan Villaescusa

Reputation: 17659

Replace:

if doc['links'] in doc: 

With:

if 'links' in doc:
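
The original test fails because doc['links'] is evaluated first, so a missing key raises KeyError before the in operator is ever applied. A small sketch with a plain dict standing in for the CouchDB document (the sample data is illustrative only):

```python
# Hypothetical document without a 'links' key
doc = {'_id': 'p_3', 'name': 'Smurfette'}

# Membership test on the key name: safe even when the key is absent
has_links = 'links' in doc

# The original expression looks the key up first, so it raises KeyError
try:
    doc['links'] in doc
    raised = False
except KeyError:
    raised = True

print(has_links, raised)
```

Note that the original expression would not work even when the key exists: it would then ask whether the list of links is itself a key of the dict, and since lists are unhashable that raises a TypeError.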

Upvotes: 1
