merlin

Reputation: 2917

Matching a key against different pairs in Python

Because an attribute can appear under different names, I need to match the key of a key-value pair against a regex.

The possible names are defined in a list of tuples:

MyAttr  = [
    ('ref_nr', 'Reference|Referenz|Referenz-Nr|Referenznummer'),
    ('color', 'Color|color|tinta|farbe|Farbe'),
]

The attributes imported from an item are in another list of tuples:

ImportAttr  = [
    ('Referenz', 'Ref-Val'),
    ('color', 'red'),
]

Now I would like to return the value of an imported attribute if its key is a known attribute, that is, if it matches one of the spellings defined for that attribute in my first list, MyAttr.

for key, value in ImportAttr:
    if key == "Referenz-Nr" : ref      = value
    if key == "Farbe"       : color    = value

The goal is to get the value of each imported attribute whose name is a known one.

print(ref)
print(color)

This should print the values if "Referenz-Nr" and "Farbe" are known attributes.

Obviously this pseudocode does not work. I just can't get my head around a function that uses a regex for the key search.

Upvotes: 2

Views: 72

Answers (2)

user6035995

The question was not entirely clear to me, but maybe this is what you want:

#!/usr/bin/python3

MyAttr  = [
    ('ref_nr', 'Reference|Referenz|Referenz-Nr|Referenznummer'),
    ('color', 'Color|color|tinta|farbe|Farbe')
]

ImportAttr  = [
    ('Referenz', 'Ref-Val'),
    ('color', 'red'),
]

ref, color = None, None

for key, value in ImportAttr:
    if key in MyAttr[0][1].split('|'): 
        ref = value
    if key in MyAttr[1][1].split('|'): 
        color = value

print("ref: ", ref)
print("color: ", color)

split() breaks the string into a list of strings at the separator (the "|" character here), and you can then check whether the key is in that list.
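Since the question asks about a regex, the same check can also be sketched with re.fullmatch, which treats the "|"-separated names as a regex alternation and only accepts a key that matches one alternative completely. This is a minimal sketch, not the approach used in the answer above; match_attrs is a hypothetical helper name, and it assumes the spellings contain no regex metacharacters that would need escaping.

```python
import re

MyAttr = [
    ('ref_nr', 'Reference|Referenz|Referenz-Nr|Referenznummer'),
    ('color', 'Color|color|tinta|farbe|Farbe'),
]

ImportAttr = [
    ('Referenz', 'Ref-Val'),
    ('color', 'red'),
]

def match_attrs(known, imported):
    """Return {attribute_name: value} for each imported key that fully
    matches one of the known patterns (hypothetical helper)."""
    # Compile each pattern once instead of on every comparison.
    compiled = [(name, re.compile(pattern)) for name, pattern in known]
    result = {}
    for key, value in imported:
        for name, regex in compiled:
            # fullmatch requires the whole key to match one alternative,
            # so 'color2' does not match 'color'.
            if regex.fullmatch(key):
                result[name] = value
                break
    return result

found = match_attrs(MyAttr, ImportAttr)
print(found.get('ref_nr'))  # Ref-Val
print(found.get('color'))   # red
```

Using .get() returns None for an attribute that was not found, instead of raising a KeyError.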

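The following solution is a bit trickier. If you don't want to hardcode the positions in your source, you can use locals(). Note that the first element of each MyAttr entry must then be the name of the target variable (hence 'ref' instead of 'ref_nr'), and that assigning through locals() only takes effect at module level, where it is the same dict as globals(); inside a function it does not change the local variables.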

#!/usr/bin/python3

MyAttr  = [
    ('ref', 'Reference|Referenz|Referenz-Nr|Referenznummer'),
    ('color', 'Color|color|tinta|farbe|Farbe')
]

ImportAttr  = [
    ('Referenz', 'Ref-Val'),
    ('color', 'red'),
]

ref, color = None, None

for var, names in MyAttr:
    for key, value in ImportAttr:
        if key in names.split('|'):
            locals()[var] = value
            break

print("ref: ", ref)
print("color: ", color)
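Because writing through locals() only works at module level, a variant that should also work inside a function can collect the results in a dict instead. This is a sketch of that alternative, not part of the original answer; collect is a hypothetical helper name.

```python
MyAttr = [
    ('ref', 'Reference|Referenz|Referenz-Nr|Referenznummer'),
    ('color', 'Color|color|tinta|farbe|Farbe'),
]

ImportAttr = [
    ('Referenz', 'Ref-Val'),
    ('color', 'red'),
]

def collect(known, imported):
    """Return a dict with one entry per known attribute; the value is
    None when no imported key matches (hypothetical helper)."""
    found = {name: None for name, _ in known}
    for name, names in known:
        aliases = names.split('|')
        for key, value in imported:
            if key in aliases:
                found[name] = value
                break  # first matching import wins, as in the loop above
    return found

found = collect(MyAttr, ImportAttr)
print("ref: ", found["ref"])
print("color: ", found["color"])
```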

Upvotes: 1

hygull

Reputation: 8740

If you want, you can also use pandas to solve this problem, which may be convenient for large data sets:

get_references_and_colors.py
import pandas as pd
import re
import json

def get_references_and_colors(lookups, attrs):
    responses = []

    refs = pd.Series(re.split(r"\|", lookups[0][0]))
    colors = pd.Series(re.split(r"\|", lookups[1][0]))
    d = {"ref": refs, "color": colors}
    df = pd.DataFrame(d).fillna('') # Replace NaN with '' in case refs
                                    # and colors are not of the same length

    #               ref  color
    # 0       Reference  Color
    # 1        Referenz  color
    # 2     Referenz-Nr  tinta
    # 3  Referenznummer  farbe
    # 4                  Farbe

    for key, value in attrs:
        response = {}
        response["for_attr"] = key

        df2 = df.loc[df["ref"] == key] # find in 'ref' column

        if not df2.empty:
            response["ref"] = value
        else:
            df3 = df.loc[df["color"] == key] # find in 'color' column
            if not df3.empty:
                response["color"] = value
            else:
                response["color"] = None # Not Available
                response["ref"] = None

        responses.append(response)

    return responses


if __name__ == "__main__":

    LOOKUPS  = [
         ('Reference|Referenz|Referenz-Nr|Referenznummer', 'a'),
         ('Color|color|tinta|farbe|Farbe', 'b'),
    ]

    ATTR  = [
        ('Referenz', 'Ref-Val'),
        ('color', 'red'),
        ('color2', 'orange'), # unknown attribute name
        ('tinta', 'Tinta-col')
    ]

    responses = get_references_and_colors(LOOKUPS, ATTR) # list of dicts
    pretty_response = json.dumps(responses, indent=4) # for pretty printing
    print(pretty_response)
Output
[
    {
        "for_attr": "Referenz",
        "ref": "Ref-Val"
    },
    {
        "for_attr": "color",
        "color": "red"
    },
    {
        "for_attr": "color2",
        "color": null,
        "ref": null
    },
    {
        "for_attr": "tinta",
        "color": "Tinta-col"
    }
]
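If pandas is not otherwise needed, the same lookup can be sketched with a plain dict that maps every spelling to its attribute name. The index is built once, so each lookup afterwards is a single dict access. This is an illustrative alternative under stated assumptions: build_alias_index and classify are hypothetical names, and the second element of each LOOKUPS entry is assumed here to carry the attribute name ('ref', 'color') rather than the unused 'a' and 'b' above.

```python
LOOKUPS = [
    ('Reference|Referenz|Referenz-Nr|Referenznummer', 'ref'),
    ('Color|color|tinta|farbe|Farbe', 'color'),
]

ATTR = [
    ('Referenz', 'Ref-Val'),
    ('color', 'red'),
    ('color2', 'orange'),  # unknown attribute name
    ('tinta', 'Tinta-col'),
]

def build_alias_index(lookups):
    """Map every known spelling to its attribute name (hypothetical helper)."""
    index = {}
    for spellings, name in lookups:
        for alias in spellings.split('|'):
            index[alias] = name
    return index

def classify(attrs, index):
    """Return (imported_key, attribute_name_or_None, value) triples."""
    return [(key, index.get(key), value) for key, value in attrs]

index = build_alias_index(LOOKUPS)
for key, name, value in classify(ATTR, index):
    print(key, name, value)
```

Unknown keys such as 'color2' come back with None as the attribute name, which corresponds to the null entries in the output above.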

Upvotes: 0
