Daniel Borysowski

Reputation: 133

Using ordered dict tuples as find&replace patterns

I want to use one dictionary as a set of find&replace patterns and apply it to another dictionary whose keys and values are strings.

I have two dictionaries. The first is an ordered dictionary (collections.OrderedDict) whose items are the find&replace patterns mentioned above.

A fragment of this dictionary looks like this:

from collections import OrderedDict

dict1 = OrderedDict([
    ('e0', 'i0'),
    ('o0', 'a0'),
    ('t sj a0$', 'ts a0'),
    ('tj sj a0$', 'ts a'),
    ('([bvgdzklmnprstfh])j a0', '\\1j i0'),
    ('([^s])j a0$', '\\1j i0')
])

As you can see, some of these patterns are plain strings and some contain RegEx special characters. This dict has to be ordered, because many of its patterns have to be applied in a specific order. A standard dict, as far as I know, iterates in a "randomish" order.

The second one looks like this:

dict2 = {
'обнёсшим': 'o0 b nj o1 s sh i0 m',
'колыхалось': 'k o0 l y0 h a1 l o0 sj',
'непроизводительностях': 'nj e0 p r o0 i0 z v o0 dj i1 tj e0 lj n o0 s tj a0 h',
'цукаемою': 'ts u0 k a1 j e0 m o0 j u0',
'соревнующееся': 's o0 rj e0 v n u1 j u0 sch e0 j e0 sj a0',
'сорганизовано': 's o0 r g a0 nj i0 z o1 v a0 n o0'
}

My goal is to iterate over the first dict (dict1) and check whether any of the find patterns (the first element of each tuple) occurs in the values of the second dictionary (dict2). If so, I want each match to be replaced with the replace pattern (the second element of that tuple).

I have this script, which almost does the job. It works as long as I don't use RegEx special characters. It doesn't work for any of $, [], [^], \1 and many others (which is quite strange, because I've tried my patterns on some strings in the Python 3 console).

for find, replace in dict1.items():
    for g, p in dict2.items():
        if find in p:
            dict2[g] = re.sub(find, replace, dict2[g])

The expected result is for these RegEx patterns to work.

Upvotes: 1

Views: 104

Answers (1)

Khalid Ali

Reputation: 1224

The issue with your code is in this line: if find in p:

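The in operator does a literal substring test, so when find contains RegEx special syntax such as a0$, the condition is False unless the value contains those exact characters ($ included), and re.sub is never called. You could use re.compile/re.search for this check instead of the in membership test, or remove the if statement altogether.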
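Here is a minimal sketch of the second option (dropping the if check), assuming the patterns live in an OrderedDict as described in the question. The names patterns, words and apply_patterns are made up for illustration and are not from your script. re.sub returns the string unchanged when nothing matches, so the membership test is not needed at all.

```python
import re
from collections import OrderedDict

# Patterns copied from the question; they are applied in insertion order.
patterns = OrderedDict([
    ('e0', 'i0'),
    ('o0', 'a0'),
    ('t sj a0$', 'ts a0'),
    ('tj sj a0$', 'ts a'),
    ('([bvgdzklmnprstfh])j a0', '\\1j i0'),
    ('([^s])j a0$', '\\1j i0'),
])

# Two entries taken from the question's dict2.
words = {
    'соревнующееся': 's o0 rj e0 v n u1 j u0 sch e0 j e0 sj a0',
    'сорганизовано': 's o0 r g a0 nj i0 z o1 v a0 n o0',
}


def apply_patterns(patterns, words):
    """Return a new dict with every pattern applied, in order, to every value."""
    result = dict(words)
    for find, replace in patterns.items():
        regex = re.compile(find)
        # No "if find in value" check: sub() leaves non-matching strings as they are.
        result = {key: regex.sub(replace, value) for key, value in result.items()}
    return result


for word, phonemes in apply_patterns(patterns, words).items():
    print(word, '->', phonemes)
```

If you do want an explicit check (for logging, say), regex.search(value) is the regex-aware replacement for find in value.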

Upvotes: 1
