Python Combining f-string with r-string and curly braces in regex

Question

Given a single word (x), return the possible n-grams that can be found in that word. You can change the n-gram value as you like; it is the number in the curly braces in the pat variable. The default n-gram value is 4.

For example, for the word x = 'abcdef', the possible 4-grams are:

['abcd', 'bcde', 'cdef']

import re

def ngram_finder(x):
    # The lookahead lets matches overlap; {4} is the hard-coded n-gram size.
    pat = r'(?=(\S{4}))'
    xx = re.findall(pat, x)
    return xx

The question is: how can I combine an f-string with an r-string in the regex, given that the regex itself uses curly braces for the quantifier?

Nick · Accepted Answer

You can use this string to interpolate the value of n into your regex, using doubled curly braces to produce a single literal brace in the output:

fr'(?=(\S{{{n}}}))'

The regex needs {} to make a quantifier (as in your original regex, {4}). However, f-strings use {} to mark a replacement field, so you need to "escape" the braces required by the regex inside the f-string. That is done with {{ and }}, which produce { and } in the output. So {{{n}}} (where n = 4) generates '{' + '4' + '}' = '{4}', as required.
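To see the escaping at work, you can print the pattern string the f-string produces before it reaches the regex engine. This is only a quick check, and the value of n is illustrative:

```python
n = 4  # illustrative n-gram size

# {{ and }} render as literal braces; {n} is replaced by the value of n.
pat = fr'(?=(\S{{{n}}}))'
print(pat)  # (?=(\S{4}))

# The r prefix keeps the backslash literal, so the result equals
# the hand-written raw string from the question.
print(pat == r'(?=(\S{4}))')  # True
```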

Complete code:

import re

def ngram_finder(x, n):
    pat = fr'(?=(\S{{{n}}}))'
    return re.findall(pat, x)
    
x = 'abcdef'
print(ngram_finder(x, 4))
print(ngram_finder(x, 5))

Output:

['abcd', 'bcde', 'cdef']
['abcde', 'bcdef']
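If the tripled braces are hard to read, the same pattern can be built in other ways. The sketch below is not part of the answer above and its variable names are only illustrative. It compares a few equivalent spellings and checks that they all yield the same string:

```python
import re

n = 4  # illustrative n-gram size

p_fr      = fr'(?=(\S{{{n}}}))'           # f-string with doubled braces
p_rf      = rf'(?=(\S{{{n}}}))'           # prefix order (fr vs rf) does not matter
p_percent = r'(?=(\S{%d}))' % n           # %-formatting: braces need no escaping
p_concat  = r'(?=(\S{' + str(n) + r'}))'  # plain concatenation of raw strings

# All four spellings produce the same regex source.
assert p_fr == p_rf == p_percent == p_concat == r'(?=(\S{4}))'

print(re.findall(p_percent, 'abcdef'))  # ['abcd', 'bcde', 'cdef']
```

The f-string version is the most compact. The %-formatting and concatenation variants trade that for not having to count braces.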
