Ken Tsoi

Reputation: 1303

Python Mongoengine: How to bypass documents which cannot pass validation to avoid crash

I am using Mongoengine to save the data from a .csv file to MongoDB.

I define a document class as:

class Marc(Document):
    BibID = IntField(required=True)
    ISBN = StringField(required=True, max_length=13)
    Author = StringField(max_length=50)
    Title = StringField(required=True, max_length=200)
    Summary = StringField(max_length=2000)
    Genre = StringField(max_length=30)
    TopicalMain = ListField(StringField(max_length=80))
    TopicalGeographic = ListField(StringField(max_length=50))
    meta = {'allow_inheritance': True, 'strict': False}

and create the document record one by one:

def __createMarcRecord(self, rec):
    return NewDoc(
        BibID=rec['BibID'],
        ISBN=rec['ISBN'],
        Title=rec['Title'],
        Author=rec['Author'] if "Author" in rec else None,
        Summary=rec['Summary'] if "Summary" in rec else None,
        Genre=rec['Genre'] if "Genre" in rec else None,
        TopicalMain=rec['Topical_Main'] if "Topical_Main" in rec else None,
        TopicalGeographic=rec['Topical_Geographic'] if "Topical_Geographic" in rec else None,
        Stored=datetime.datetime.now()
    )

and then I try to save the document:

def storeToMongoDB(self):
    count = 0
    with open(self.errorLog, 'w') as ef:
        for i in self.bookList:
            count += 1
            print(count)
            marcRec = self.__createMarcRecord(i)
            try:
                marcRec.save()
            except Exception:
                ef.write("{0}-{1}\n".format(count, sys.exc_info()[0]))

Some entries in the .csv file have no "Title" value, so they cannot pass validation, and the code crashes:

Traceback (most recent call last):
  File "/.../mongo.py", line 132, in storeToMongoDB
    marcRec = self.__createMarcRecord(i)
  File "/.../mongo.py", line 112, in __createMarcRecord
    Title=rec['Title'],
KeyError: 'Title'

Is there any way to skip these documents so that the program does not crash?

Upvotes: 1

Views: 275

Answers (1)

Eno Gerguri

Reputation: 687

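At first glance, it looks like you can wrap the call to __createMarcRecord in a try/except KeyError. The traceback shows the KeyError is raised at line 112, while the record is still being built, so the existing try/except around marcRec.save() never gets a chance to catch it: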

try:
    marcRec = self.__createMarcRecord(i)
except KeyError:
    continue  # move onto next book
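
For a fuller picture, here is a minimal sketch of the whole loop that skips bad rows and logs them instead of crashing. It is only an illustration under some assumptions: build_fields and store_all are hypothetical helper names, and the save argument is a stand-in callable for something like Marc(**fields).save(), so the sketch runs without a MongoDB connection. The required keys and the CSV column names are taken from the question.

```python
import io

# Required keys, taken from the required=True fields of the question's Marc class.
REQUIRED_KEYS = ("BibID", "ISBN", "Title")

# Optional CSV columns mapped to the document field names used in the question.
OPTIONAL_KEYS = {
    "Author": "Author",
    "Summary": "Summary",
    "Genre": "Genre",
    "Topical_Main": "TopicalMain",
    "Topical_Geographic": "TopicalGeographic",
}


def build_fields(rec):
    """Hypothetical helper: map one CSV row (a dict) to keyword arguments.

    Raises KeyError if a required key is missing, like rec['Title'] does
    in the question.
    """
    fields = {key: rec[key] for key in REQUIRED_KEYS}
    for csv_key, field_name in OPTIONAL_KEYS.items():
        # dict.get returns None when the key is absent.
        fields[field_name] = rec.get(csv_key)
    return fields


def store_all(records, save, error_log):
    """Hypothetical helper: save every valid record, log and skip the rest.

    `save` is an assumed callable standing in for Marc(**fields).save().
    Returns the number of records that were saved.
    """
    stored = 0
    for count, rec in enumerate(records, start=1):
        try:
            save(build_fields(rec))
        except KeyError as exc:
            # A required key is missing from this row.
            error_log.write("{0}-missing key {1}\n".format(count, exc))
            continue
        except Exception as exc:
            # Anything raised by save(), for example a validation error.
            error_log.write("{0}-{1}\n".format(count, type(exc).__name__))
            continue
        stored += 1
    return stored


# Small demonstration: an in-memory log and a list standing in for the database.
demo_log = io.StringIO()
demo_saved = []
demo_count = store_all(
    [{"BibID": 1, "ISBN": "111", "Title": "A"}, {"BibID": 2, "ISBN": "222"}],
    demo_saved.append,
    demo_log,
)
print(demo_count)
print(demo_log.getvalue())
```

The point of putting both the record construction and the save inside the same try block is that a missing key and a failed save are handled the same way: the row number goes to the error log and the loop moves on to the next book.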

Upvotes: 1
