Reputation: 5251

Python duplicate code removal

According to CodeClimate, the two methods in the simplified class below are duplicates of one another with a mass of about 40. What's the best way to refactor to remove the duplication? The equivalent dictionary comprehension has a very marginally lower mass, but is otherwise identical.

 class DataAdaptor:
     def __init__(self):
         self._feeds = {'field1': 'temperature', 'field2': 'humidity'}

     def parse_data(self, data):
         content = {}
         for field, feed in self._feeds.items():
             if feed in data:
                 content[field] = data[feed]
         return content

     def parse_content(self, content):
         data = {}
         for field, feed in self._feeds.items():
             if field in content:
                 data[feed] = content[field]
         return data

Clarification: The version with dictionary comprehensions has almost exactly the same duplication mass but I think it's slightly less clear.

 class DataAdaptor:
     def __init__(self):
         self._feeds = {'field1': 'temperature', 'field2': 'humidity'}

     def parse_data(self, data):
         return {field: data[feed] for field, feed in self._feeds.items() if feed in data}

     def parse_content(self, content):
         return {feed: content[field] for field, feed in self._feeds.items() if field in content}

This is a green field development, so we're free to do anything.

Upvotes: 0

Answers (5)

Nzbuu

Reputation: 5251

The alternative that I'm considering is to store the reverse look-up dictionary as well.

 class DataAdaptor:
     def __init__(self):
         self._feeds = {'field1': 'temperature', 'field2': 'humidity'}
         self._fields = {'temperature': 'field1', 'humidity': 'field2'}

     def parse_data(self, data):
         return self._reform_data(data, self._fields)

     def parse_content(self, content):
         return self._reform_data(content, self._feeds)

     def _reform_data(self, data, names)
         return {names[k]: data[k] for k in names if k in data}

Or is there an easier way to swap the keys and values in the dictionary?

Upvotes: 1

Eugene

Reputation: 656

Instead of unpacking item into k v pair, we can address k and v by index and magically invert index when needed

 def _parse_lol(self, source, num)
     result = {}
     for item in self._feeds.items():
         if item[1] in source:
             result[num] = source[1-num]
     return result

 def parse_data(self, data):
     return self.parse_lol(data, 0)

 def parse_content(self, content):
     return self.parse_lol(data, 1)

I may misaligned 0 and 1,

UPD

 def parse_parse(self, **kwargs):
     direction = ["content", "data"].index(kwargs.keys()[0])
     source = kwargs.values()[0]
     result = {}
     for item in self._feeds.items():
         if item[1] in source:
             result[item[direction]] = source[item[1-direction]]
     return result

Upvotes: 0

Paul Cornelius

Reputation: 10936

I have read the first three answers and I would like to offer something different. Don't re-write those functions. They're fine as they are. They are short, easy to understand and easy to read. Why mess with them? I've never used CodeClimate and have no idea what a mass of 40 means, but I think it's a mistake to regard any static code-checking tool as the absolute final authority on how a piece of software should be written. I bet you have bigger things to worry about than partial duplication in a simple little function, which as you already said could be written as a single dictionary comprehension.

I have one suggestion, however: change the names of the two functions to something like select_by_matching_values and select_by_matching_keys. That gives visibility to what they do and how they differ from each other.

Upvotes: 3

Padraic Cunningham

Reputation: 180401

The only way I see is to use kwargs and pass in the data as keywords and use the logic based on that:

class DataAdaptor:
    def __init__(self):
        self._feeds = {'field1': 'temperature', 'field2': 'humidity'}

    def parse_content(self, **kwargs):
        data, content = kwargs.get("data", {}), kwargs.get("content", {})
        return {k: data[v] for k, v in self._feeds.items() if v in data} if data\
            else {self._feeds[k]: content[k] for k in self._feeds.keys() & content}

You can also use self._feeds.keys() & data to get the common keys, for python2 you would need to change it to self._feeds.viewkeys() & data

You could also use named arguments:

class DataAdaptor:
    def __init__(self):
        self._feeds = {'field1': 'temperature', 'field2': 'humidity'}

    def parse_content(self, data=None, content=None):
        return {k: data[v] for k, v in self._feeds.items() if v in data} if data\
            else {self._feeds[k]: content[k] for k in self._feeds.keys() & content}

Although personally I would prefer to use the two methods.

Upvotes: 0

Prune

Reputation: 77837

The two aren't entirely identical: they copy references in opposite directions. However, writing a simple conditional dictionary comprehension is probably the best way to go. Changing the input parameter to accent the similarity:

parse_data(self, foo):
    return {field: foo[feed] for field, feed in self._feeds.iteritems() \
                                 if feed in foo}

... where parse_content is subtly different in what's checked and returned.

parse_content(self, foo):
    return {feed: foo[field] for field, feed in self._feeds.iteritems() \
                                 if field in foo}

Does this help?

Upvotes: 0

Python duplicate code removal

Answers (5)

Related Questions