merlin

Reputation: 2917

How to match different spellings of a key name in Python?

I need to match input against the various possible spellings of a key name and return the corresponding value.

How can I improve the function I wrote so that only one line is necessary per key, e.g. (10 bar|10 ATM|100m|100 m)?

def water(i): 
    switcher={
            'bis 10 bar' : 127,      
            '10 bar' : 127, 
            '10 ATM' : 127,       
            '100m' : 127,      
            '100 m' : 127,      
            '300m' : 129,      
            '300 m' : 129,      
            'bis 30 bar' : 129,      
            '30 bar' : 129,      
         }
    for k, v in switcher.items():
        if k.lower() in i.lower():
            return v
    return "Invalid: " + i

print(water('10 ATM'))

The function returns the value of the first key found in the input; if no key is found, it returns "Invalid: " followed by the input.

So print(water('10 ATM')) prints 127.

I am looking for a way to match different writing styles of the key.

Upvotes: 1

Views: 86

Answers (6)

miraculixx

Reputation: 10359

a way that only one line is necessary per key (...) looking for a way to match different writing styles of the key

Use a custom dictionary

Here's an approach that works like a dictionary and is easy to reuse for your other scenarios (e.g. color).

# definition, replacing the water function
water = FuzzyDict({
    'bis 10 bar|10.?bar|10 ATM|100.?m' : 127,
    '300.?m|bis 30 bar|30 bar': 129
})

This way you can use water like a normal dictionary, so you no longer need a function. You can also test whether an item is stored, e.g. with if <query> in water:

# example use
water['bis 10 bar'], water['10 bar'], water['10 ATM'], water['300m'], water['300 m']
=> (127, 127, 127, 129, 129)
'bis 10 bar' in water, 'red color' in water
=> (True, False)

One-time implementation of FuzzyDict

The FuzzyDict is implemented as follows. Put this code in a separate module fuzzydict.py, then forget about it -- your actual code looks like the demo above.

How does this work? FuzzyDict overrides the __getitem__ method of a normal dictionary to parse the regular expressions provided as keys. To make this work efficiently, the patterns are compiled on definition of entries and the first match among all patterns is used.

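import re


class FuzzyDict(dict):
    """Dictionary whose keys are regular expressions."""

    def __init__(self, *args, **kwargs):
        # Maps each key (a pattern string) to its compiled pattern.
        self._patterns = {}
        super(FuzzyDict, self).__init__()
        # Route every initial entry through __setitem__ so it gets compiled.
        for k, v in dict(*args, **kwargs).items():
            self[k] = v

    def __getitem__(self, k):
        # Use the first stored pattern that matches the start of the query.
        for pk, p in self._patterns.items():
            if p.match(k):
                k = pk
                break
        return super(FuzzyDict, self).__getitem__(k)

    def __setitem__(self, k, v):
        # Compile each pattern once, when the entry is defined.
        if k not in self._patterns:
            self._patterns[k] = re.compile(k)
        super(FuzzyDict, self).__setitem__(k, v)

    def __contains__(self, k):
        # A query is "in" the dict if a lookup succeeds.
        try:
            self[k]
        except KeyError:
            return False
        return True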

Notes

  • the class implementation is not complete, but it serves the purpose of the question.
  • the _patterns helper dict maps pattern strings to their compiled patterns. This way FuzzyDict.keys() returns the keys as specified, not the compiled patterns (as is the case in Omkar Sabade's solution). This makes it easier to debug/work with the keys outside of __getitem__.
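
One thing to keep in mind with the lookup above: a compiled pattern's match() anchors only at the start of the string and is case-sensitive by default. The short sketch below compares it with fullmatch() and the re.IGNORECASE flag (Python 3). It only illustrates the difference; the variable names are made up for this example, and whether the stricter behaviour is wanted depends on your data.

```python
import re

# Same alternation as the first FuzzyDict key above.
pattern = re.compile(r'bis 10 bar|10.?bar|10 ATM|100.?m')

# match() only requires the pattern to match at the start of the string,
# so trailing text is accepted.
prefix_hit = pattern.match('100 meters deep') is not None

# fullmatch() requires the whole string to match.
full_hit = pattern.fullmatch('100 meters deep') is not None
exact_hit = pattern.fullmatch('100 m') is not None

# Without re.IGNORECASE, '10 atm' does not match '10 ATM'.
strict = pattern.fullmatch('10 atm') is not None
loose = re.compile(pattern.pattern, re.IGNORECASE).fullmatch('10 atm') is not None

print(prefix_hit, full_hit, exact_hit, strict, loose)
```

If you want exact, case-insensitive lookups, swapping p.match(k) for p.fullmatch(k) and compiling with re.IGNORECASE would be one way to get there.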

Upvotes: 0

Serge Ballesta

Reputation: 149075

If each group contains a distinctive substring (or pattern), and you are willing to accept any input that contains it, you can simply search for that substring. For example, here you could do:

if '10' in i:
    return 127
elif '30' in i:
    return 129
else:
    return "Invalid: " + i

But it would also accept keys other than those in your lists...
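
To illustrate the caveat: a plain '10' in i test also accepts inputs such as '210 bar' or '1000m'. If that is too permissive, one possible tightening is a regular expression with word boundaries and the unit spelled out. This is only a sketch; the RULES table and its patterns are assumptions derived from the spellings listed in the question.

```python
import re

# Hypothetical rule table: (compiled pattern, value).
# \b keeps '10' from matching inside '210' or '1000'.
RULES = [
    (re.compile(r'\b(?:10\s?(?:bar|atm)|100\s?m)\b', re.IGNORECASE), 127),
    (re.compile(r'\b(?:30\s?bar|300\s?m)\b', re.IGNORECASE), 129),
]


def water(text):
    # Return the value of the first rule found anywhere in the text.
    for pattern, value in RULES:
        if pattern.search(text):
            return value
    return "Invalid: " + text


print(water('bis 10 bar'))  # 127
print(water('300 m'))       # 129
print(water('210 bar'))     # Invalid: 210 bar
```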

Upvotes: 0

yuvgin

Reputation: 1362

Not terribly elegant, but you could try:

def water(i): 
    if i in ['bis 10 bar', '10 bar', '10 ATM', '100m', '100 m']:
        return 127
    elif i in ['300m', '300 m', 'bis 30 bar', '30 bar']:
        return 129
    else:
        return "Invalid: " + i

This could also be further generalized to take into consideration lower/upper case and spaces:

def water(i): 
    j = i.replace(" ", "").lower()
    if j in ['bis10bar', '10bar', '10atm', '100m']:
        return 127
    elif j in ['bis30bar', '30bar', '30atm', '300m']:
        return 129
    else:
        return "Invalid: " + i

Note, however, that the latter accepts any variation of spacing. If you wish to restrict it to specific cases only, you can remove the .replace(" ", "") from the second example and make the entries in the lists more specific.

Upvotes: 1

Omkar Sabade

Reputation: 783

Use re to specify patterns. This will work for your example:

import re
switcher = {
    re.compile('.*10.*'):127,
    re.compile('.*30.*'):129
}

def water(string):
    for i in switcher.keys():
        if re.match(i,string):
            return switcher[i]
    return "Invalid"

You could very well group the different patterns into a single list and do a check on the list instead. But re will give you better pattern matching if that's what you want.
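
As a rough sketch of the "single list" idea combined with re: each group of spellings can be kept as one list and turned into a single alternation pattern. The GROUPS and PATTERNS names are made up for this example, and fullmatch() is used so that only the listed spellings are accepted (ignoring case and surrounding whitespace), which is stricter than the .*10.* patterns above.

```python
import re

# One line per value: all accepted spellings, taken from the question.
GROUPS = {
    127: ['bis 10 bar', '10 bar', '10 ATM', '100m', '100 m'],
    129: ['300m', '300 m', 'bis 30 bar', '30 bar'],
}

# Build one case-insensitive alternation per group; re.escape keeps the
# spellings literal.
PATTERNS = [
    (re.compile('|'.join(re.escape(name) for name in names), re.IGNORECASE), value)
    for value, names in GROUPS.items()
]


def water(string):
    candidate = string.strip()
    for pattern, value in PATTERNS:
        if pattern.fullmatch(candidate):
            return value
    return "Invalid: " + string


print(water('10 atm'))   # 127
print(water('310 bar'))  # Invalid: 310 bar
```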

Upvotes: 3

gosuto

Reputation: 5741

I created a new dictionary with aliases to the original switcher keys:

def water(i):
    switcher = {
        10: 127,
        30: 129
    }
    aliases = {
        10: ['bis 10 bar', '10 bar'],
        30: ['300m', '30 bar']
    }
    for k, v in aliases.items():
        if i.lower() in v:
            return switcher[k]
    return 'invalid: ' + i
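
A possible variation on the alias idea, sketched under the assumption that case and whitespace differences should be ignored: list the spellings once per value, then invert that mapping into a flat lookup table so each query is a single dictionary access. ALIASES, LOOKUP and normalize are hypothetical names for this example.

```python
# One line per value, using the spellings from the question.
ALIASES = {
    127: ['bis 10 bar', '10 bar', '10 ATM', '100m', '100 m'],
    129: ['300m', '300 m', 'bis 30 bar', '30 bar'],
}


def normalize(text):
    # Lowercase and drop all whitespace, so '10 ATM' and '10atm' compare equal.
    return ''.join(text.lower().split())


# Inverted table, built once: normalized spelling -> value.
LOOKUP = {
    normalize(alias): value
    for value, names in ALIASES.items()
    for alias in names
}


def water(text):
    return LOOKUP.get(normalize(text), 'invalid: ' + text)


print(water('10 ATM'))  # 127
print(water('300 M'))   # 129
print(water('red'))     # invalid: red
```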

Upvotes: 0

kiner_shah

Reputation: 4641

def water(i): 
    switcher={
            'bis 10 bar' : 127,      
            '10 bar' : 127, 
            '10 ATM' : 127,       
            '100m' : 127,      
            '100 m' : 127,      
            '300m' : 129,      
            '300 m' : 129,      
            'bis 30 bar' : 129,      
            '30 bar' : 129,      
         }
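    # Try an exact match first, then a case-insensitive one.
    if i in switcher:
        return switcher[i]
    lowered = {k.lower(): v for k, v in switcher.items()}
    if i.lower() in lowered:
        return lowered[i.lower()]
    return "Invalid: " + i

print(water('10 ATM'))

I replaced the for loop with an if-else statement.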

Upvotes: 0
