Reputation: 5251
According to CodeClimate, the two methods in the simplified class below are duplicates of one another with a mass of about 40. What's the best way to refactor to remove the duplication? The equivalent dictionary comprehension has a very marginally lower mass, but is otherwise identical.
class DataAdaptor:
def __init__(self):
self._feeds = {'field1': 'temperature', 'field2': 'humidity'}
def parse_data(self, data):
content = {}
for field, feed in self._feeds.items():
if feed in data:
content[field] = data[feed]
return content
def parse_content(self, content):
data = {}
for field, feed in self._feeds.items():
if field in content:
data[feed] = content[field]
return data
Clarification: The version with dictionary comprehensions has almost exactly the same duplication mass but I think it's slightly less clear.
class DataAdaptor:
def __init__(self):
self._feeds = {'field1': 'temperature', 'field2': 'humidity'}
def parse_data(self, data):
return {field: data[feed] for field, feed in self._feeds.items() if feed in data}
def parse_content(self, content):
return {feed: content[field] for field, feed in self._feeds.items() if field in content}
This is a green field development, so we're free to do anything.
Upvotes: 0
Views: 129
Reputation: 5251
The alternative that I'm considering is to store the reverse look-up dictionary as well.
class DataAdaptor:
def __init__(self):
self._feeds = {'field1': 'temperature', 'field2': 'humidity'}
self._fields = {'temperature': 'field1', 'humidity': 'field2'}
def parse_data(self, data):
return self._reform_data(data, self._fields)
def parse_content(self, content):
return self._reform_data(content, self._feeds)
def _reform_data(self, data, names)
return {names[k]: data[k] for k in names if k in data}
Or is there an easier way to swap the keys and values in the dictionary?
Upvotes: 1
Reputation: 656
Instead of unpacking item into k v pair, we can address k and v by index and magically invert index when needed
def _parse_lol(self, source, num)
result = {}
for item in self._feeds.items():
if item[1] in source:
result[num] = source[1-num]
return result
def parse_data(self, data):
return self.parse_lol(data, 0)
def parse_content(self, content):
return self.parse_lol(data, 1)
I may misaligned 0 and 1,
UPD
def parse_parse(self, **kwargs):
direction = ["content", "data"].index(kwargs.keys()[0])
source = kwargs.values()[0]
result = {}
for item in self._feeds.items():
if item[1] in source:
result[item[direction]] = source[item[1-direction]]
return result
Upvotes: 0
Reputation: 10936
I have read the first three answers and I would like to offer something different. Don't re-write those functions. They're fine as they are. They are short, easy to understand and easy to read. Why mess with them? I've never used CodeClimate and have no idea what a mass of 40 means, but I think it's a mistake to regard any static code-checking tool as the absolute final authority on how a piece of software should be written. I bet you have bigger things to worry about than partial duplication in a simple little function, which as you already said could be written as a single dictionary comprehension.
I have one suggestion, however: change the names of the two functions to something like select_by_matching_values
and select_by_matching_keys
. That gives visibility to what they do and how they differ from each other.
Upvotes: 3
Reputation: 180401
The only way I see is to use kwargs and pass in the data as keywords and use the logic based on that:
class DataAdaptor:
def __init__(self):
self._feeds = {'field1': 'temperature', 'field2': 'humidity'}
def parse_content(self, **kwargs):
data, content = kwargs.get("data", {}), kwargs.get("content", {})
return {k: data[v] for k, v in self._feeds.items() if v in data} if data\
else {self._feeds[k]: content[k] for k in self._feeds.keys() & content}
You can also use self._feeds.keys() & data
to get the common keys, for python2 you would need to change it to self._feeds.viewkeys() & data
You could also use named arguments:
class DataAdaptor:
def __init__(self):
self._feeds = {'field1': 'temperature', 'field2': 'humidity'}
def parse_content(self, data=None, content=None):
return {k: data[v] for k, v in self._feeds.items() if v in data} if data\
else {self._feeds[k]: content[k] for k in self._feeds.keys() & content}
Although personally I would prefer to use the two methods.
Upvotes: 0
Reputation: 77837
The two aren't entirely identical: they copy references in opposite directions. However, writing a simple conditional dictionary comprehension is probably the best way to go. Changing the input parameter to accent the similarity:
parse_data(self, foo):
return {field: foo[feed] for field, feed in self._feeds.iteritems() \
if feed in foo}
... where parse_content is subtly different in what's checked and returned.
parse_content(self, foo):
return {feed: foo[field] for field, feed in self._feeds.iteritems() \
if field in foo}
Does this help?
Upvotes: 0