Palimondo

Reputation: 7411

Type conversion for namedtuple fields during initialization

I have a class with a few fields that is initialized from an iterable (such as a row from csv.reader). The __init__ performs type conversion from strings to numbers for some of them:

class PerformanceTestResult(object):
    def __init__(self, csv_row):
        # csv_row[0] is just an ordinal number of the test - skip that
        self.name = csv_row[1]          # Name of the performance test
        self.samples = int(csv_row[2])  # Number of measurement samples taken
        self.min = int(csv_row[3])      # Minimum runtime (ms)
        self.max = int(csv_row[4])      # Maximum runtime (ms)
        self.mean = int(csv_row[5])     # Mean (average) runtime (ms)
        self.sd = float(csv_row[6])     # Standard deviation (ms)

I’m thinking about converting it to just a namedtuple, since there is not much else to it, but I would like to keep the type conversion during initialization. Is there a way to do this with namedtuple? (I haven’t noticed an __init__ method in the verbose output from the namedtuple factory function, which gives me pause about how the default initializer works.)
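For reference, a quick check (a small sketch, not part of my actual code) shows that the generated class defines __new__ rather than __init__, since fields of an immutable tuple are set at construction time:

```python
from collections import namedtuple

Point = namedtuple('Point', ['x', 'y'])

# The generated class defines __new__, not __init__:
print('__new__' in Point.__dict__)   # True
print('__init__' in Point.__dict__)  # False
```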

Upvotes: 8

Views: 2187

Answers (1)

Mahi

Reputation: 21932

Instead of passing in the csv_row as-is, as you currently do, you could unpack it using the unpacking operator *. For example:

>>> def f(a, b):
...     return a + b
...
>>> csv_row = [1, 2]
>>> f(*csv_row)  # Instead of your current f(csv_row)
3

This also works with a namedtuple, since the order of the arguments is preserved when unpacking:

>>> from collections import namedtuple
>>> PerformanceTestResult = namedtuple('PerformanceTestResult', [
...     'name',
...     'samples',
...     'min',
...     'max',
...     'mean',
...     'sd',
... ])
>>> test_row = ['test', '123', 2, 5, 3, None]  # from your csv file
>>> ptr = PerformanceTestResult(*test_row)
>>> ptr
PerformanceTestResult(name='test', samples='123', min=2, max=5, mean=3, sd=None)

Not only does this allow you to use namedtuple, which seems like a really good fit here, but it also removes the need for PerformanceTestResult to know anything about the CSV file. Abstraction is good: you can now use the same class regardless of where the data comes from and in what format.
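As a sketch of that point, the same namedtuple also accepts keyword arguments, so a dict (say, parsed from JSON — a hypothetical source here, not something from your question) works just as well as an unpacked CSV row:

```python
from collections import namedtuple

PerformanceTestResult = namedtuple('PerformanceTestResult', [
    'name', 'samples', 'min', 'max', 'mean', 'sd',
])

# Hypothetical data from a different source: a dict, e.g. parsed from JSON
data = {'name': 'test', 'samples': 123, 'min': 2, 'max': 5, 'mean': 3, 'sd': 0.5}
ptr = PerformanceTestResult(**data)
print(ptr.samples)  # 123
```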


If you need the int() and float() conversions, you will have to write a separate conversion function. You can either build it into PerformanceTestResult by subclassing:

_PerformanceTestResult = namedtuple('PerformanceTestResult', [
    'name', 'samples', 'min', 'max', 'mean', 'sd',
])

class PerformanceTestResult(_PerformanceTestResult):
    @classmethod
    def from_csv(cls, row):
        # row[0] is the test's ordinal number - skip it
        return cls(
            row[1],           # name
            int(row[2]),      # samples
            int(row[3]),      # min
            int(row[4]),      # max
            int(row[5]),      # mean
            float(row[6]),    # sd
        )

Which can be used like so:

>>> ptr = PerformanceTestResult.from_csv(your_csv_row)

Or you can create a separate conversion function:

def parse_csv_row(row):
    # row[0] is the test's ordinal number - skip it
    return (row[1], int(row[2]), int(row[3]),
            int(row[4]), int(row[5]), float(row[6]))

And now use this to convert the row before unpacking:

>>> ptr = PerformanceTestResult(*parse_csv_row(your_csv_row))
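Putting that together as a runnable sketch (with a sample row in which every field is a string, as csv.reader would yield it):

```python
from collections import namedtuple

PerformanceTestResult = namedtuple('PerformanceTestResult', [
    'name', 'samples', 'min', 'max', 'mean', 'sd',
])

def parse_csv_row(row):
    # row[0] is the test's ordinal number - skip it
    return (row[1], int(row[2]), int(row[3]),
            int(row[4]), int(row[5]), float(row[6]))

# All values are strings, as csv.reader would produce them
csv_row = ['1', 'test', '123', '2', '5', '3', '0.5']
ptr = PerformanceTestResult(*parse_csv_row(csv_row))
print(ptr)
# PerformanceTestResult(name='test', samples=123, min=2, max=5, mean=3, sd=0.5)
```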

Upvotes: 3
