Reputation: 1196
I'm trying to implement a simple ORM in python. I'm facing a code duplication issue and I do not know how to solve it. Here is a simplified example of a class in my project:
class Person:
TABLE_NAME = 'person'
FIELDS = [
('name', 'VARCHAR(50)'),
('age', 'INTEGER')
]
# CODE DUPLICATION: the two next lines shoudl be genereated with FIELDS not hard coded...
name: str
age: int
def __init__(self, **kwargs):
self.__dict__ = kwargs
@classmethod
def create_sql_table(cls):
# use TABLE_NAME and FIELDS to create sql table
pass
alice = Person(name='Alice', age=25)
print(alice.name)
If I remove the two lines name: str
and age: int
I lose auto-completion and I get a mypy error on the print line (Error: Person has no attribute name)
But If I keep it, I have code duplication (I write twice each field name).
Is there a way to avoid the code duplication (by generating this two lines using FIELDS variable for instance) ?
Or another way to implement this class that avoid code duplication (without mypy error and auto-completion loss) ?
Upvotes: 0
Views: 85
Reputation: 5458
You can use descriptors:
from typing import Generic, TypeVar, Any, overload, Union
T = TypeVar('T')
class Column(Generic[T]):
sql_type: str # the field type used for this column
def __init__(self) -> None:
self.name = '' # the name of the column
# this is called when the Person class (not the instance) is created
def __set_name__(self, owner: Any, name: str) -> None:
self.name = name # now contains the name of the attribute in the class
# the overload for the case: Person.name -> Column[str]
@overload
def __get__(self, instance: None, owner: Any) -> 'Column[T]': ...
# the overload for the case: Person().name -> str
@overload
def __get__(self, instance: Any, owner: Any) -> T: ...
# the implementation of attribute access
def __get__(self, instance: Any, owner: Any) -> Union[T, 'Column[T]']:
if instance is None:
return self
# implement your attribute access here
return getattr(instance, f'_{self.name}') # type: ignore
# the implementation for setting attributes
def __set__(self, instance: Any, value: T) -> None:
# maybe check here that the type matches
setattr(instance, f'_{self.name}', value)
Now we can create specializations for each column type:
class Integer(Column[int]):
sql_type = 'INTEGER'
class VarChar(Column[str]):
def __init__(self, size: int) -> None:
self.sql_type = f'VARCHAR({size})'
super().__init__()
And when you define the Person
class we can use the column types
class Person:
TABLE_NAME = 'person'
name = VarChar(50)
age = Integer()
def __init__(self, **kwargs: Any) -> None:
for key, value in kwargs.items():
setattr(self, key, value)
@classmethod
def create_sql_table(cls) -> None:
print("CREATE TABLE", cls.TABLE_NAME)
for key, value in vars(cls).items():
if isinstance(value, Column):
print(key, value.sql_type)
Person.create_sql_table()
p = Person(age=10)
print(p.age)
p.age = 20
print(p.age)
This prints:
CREATE TABLE person
name VARCHAR(50)
age INTEGER
10
20
You should probably also create a base Model
class that contains the __init__
and the class method of Person
You can also extend the Column
class to allow nullable columns and add default values.
Mypy does not complain and can correctly infer the types for Person.name
to str and Person.age
to int.
Upvotes: 1
Reputation: 2195
Ok, I ended up with that
class Person:
# this is not full, you need to fill other types you use it with the correct relationship
types = {
str: 'VARCHAR(50)',
int: 'INTEGER',
} # you should extract that out if you use it elsewhere
TABLE_NAME = 'person'
# NOTE: the only annotated fields should be these. if you annotate anything else, It will break
name: str
age: int
def __init__(self, **kwargs):
self.__dict__ = kwargs
@property
def FIELDS(cls):
return [(key, cls.types[value]) for key, value in cls.__annotations__.items()]
alice = Person(name='Alice', age=25)
print(alice.FIELDS) # [('name', 'VARCHAR(50)'), ('age', 'INTEGER')]
And
>>> mypy <module>
>>> Success: no issues found in 1 source file
Upvotes: 1