mrgou
mrgou

Reputation: 2458

Using Python Regular Expression Code in SQLite SELECT Statement

I'm trying to extract substrings from filenames stored in a table column, using a regular expression. So I'm using a custom python function:

In [1]: import sqlite3, re
In [2]: def search(expr, item):
   ...:     return re.search(expr, item).group()
   ...:
In [3]: conn = sqlite3.connect(':memory:')
In [4]: conn.create_function('SEARCH_REGEXP', 2, search)
In [5]: regexp = '[0-9]+\.[0-9]+\.(docx|pdf|rtf|doc)$'
In [6]: sql = "select SEARCH_REGEXP('%s', filename) from file_list;" % regexp
In [7]: cursor = conn.cursor()
In [8]: cursor.execute('CREATE TABLE "file_list" ("id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT UNIQUE, "filename" TE
   ...: XT);')
Out[8]: <sqlite3.Cursor at 0x1b7eafb7dc0>
In [9]: cursor.execute("INSERT INTO file_list(filename) VALUES ('filename.1.0.pdf'), ('filename.2.0.pdf');")
Out[9]: <sqlite3.Cursor at 0x1b7eafb7dc0>
In [10]: cursor.execute('select * from file_list;')
Out[10]: <sqlite3.Cursor at 0x1b7eafb7dc0>
In [11]: cursor.fetchall()
Out[11]: [(1, 'filename.1.0.pdf'), (2, 'filename.2.0.pdf')]
In [12]: cursor.execute(sql)
Out[12]: <sqlite3.Cursor at 0x1b7eafb7dc0>
In [13]: cursor.fetchall()
Out[13]: [('1.0.pdf',), ('2.0.pdf',)]

Now, I insert a value that will not match the reg exp:

In [14]: cursor.execute("INSERT INTO file_list(filename) VALUES ('filename.pdf');")
Out[14]: <sqlite3.Cursor at 0x1b7eafb7dc0>

But this will now raise an exception:

In [15]: cursor.execute(sql)
Out[15]: <sqlite3.Cursor at 0x1b7eafb7dc0>
In [16]: cursor.fetchall()
---------------------------------------------------------------------------
OperationalError                          Traceback (most recent call last)
<ipython-input-17-d35fc1caa8d6> in <module>
----> 1 cursor.fetchall()

OperationalError: user-defined function raised exception

How should I handle that, e.g. so the user-defined function returns an empty string in case of a no-match?

Thanks!

R.

Upvotes: 1

Views: 611

Answers (1)

mr.mams
mr.mams

Reputation: 462

This is because, in your search function, re.search(expr, item) returns None when there is no match. Since None is not a MatchObject, an exception will be raised.

You can only call group() if there is a match :

def search(expr, item):
   result =  re.search(expr, item)
   if result != None:
        return result.group()
   else:
        return tuple() #See note below

NOTE: You also need to think about what should be done when there is no match in order to avoid other similar issues. I'm returning an empty tuple here because group() returns a tuple, but you may want to do something else.

Upvotes: 2

Related Questions