Reputation: 121
I use python3 to search an exact word in a sqlite3 database. When I do it with SQLite database browser it works perfectly. But when I do it with python3, it throws: sqlite3.OperationalError: user-defined function raised exception. (I will use this code for an app to search in a treeview)
Here is the code:
import sqlite3
import re
def regexp(expr, item):
reg = re.compile(expr)
return reg.search(item) is not None
conn = sqlite3.connect('/database.db')
conn.create_function("REGEXP", 2, regexp)
cursor = conn.cursor()
cursor.execute('SELECT * FROM table WHERE column REGEXP ?',['(?i)(?<=\s|^|\W)word(?=\s|$|\W)'])
data=cursor.fetchall()
print(data)
I updated my code as:
import sqlite3
import re
def regexp
def regexp(expr, item):
try:
pattern = re.compile(expr, re.I)
return re.search(pattern, str) is not None
except Exception as e:
print(e)
conn = sqlite3.connect('database.db')
conn.create_function("regexp", 2, regexp)
cursor = conn.cursor()
try:
cursor.execute(r'SELECT * FROM table WHERE column REGEXP ?',[r'(?i)(?<=\s|^|\W)word(?=\s|$|\W)'])
data=cursor.fetchall()
print(data)
except Exception as e:
print(e)
And it throws : look-behind requires fixed-width pattern
Upvotes: 2
Views: 433
Reputation: 626754
The (?i)(?<=\s|^|\W)word(?=\s|$|\W)
pattern contains a lookbehind with alternatives, two of which match a single-char pattern and one of them matches no char (^
only asserts the position at the start of string).
The problem is that Python re
does not support lookbehinds of non-fixed length. Let's try to refactor the regex:
(?i)(?<=[\s\W]|^)word(?=[\s\W]|$)
: Look how single-char matching patterns are grouped with the help of a character class. We now clearly see that [\s\W]
can be re-written as \W
because \W
matches whitespaces. So, the regex can look like(?i)(?<=\W|^)word(?=\W|$)
: Better, but it is still not what we need, the alternatives do not match the same amount of chars. However, at this point I can clearly see you want to match a word
in between non-word chars or start/end of string.So, you can use
(?i)(?<!\w)word(?!\w)
with (?<!\w)
negative lookbehind and (?!\w)
negative lookahead both containing a single \w
(no more alternation operator necessary!), which is basically
(?i)\bword\b
with mere word boundaries (in Python code, r'(?i)\bword\b'
).
However, the pattern variant with the lookarounds can be prefereble in cases the word
is a variable, when you use it like
fr'(?i)(?<!\w){re.escape(word)}(?!\w)'
As far as the code is concerned, you have a typo in the regexp
method: str
cannot be used as the input argument to re.search
. You can fix it by using
def regexp(expr, item):
try:
return re.search(expr, item, re.I) is not None
except Exception as e:
print(e)
See demo:
Upvotes: 2