Curtwagner1984

Reputation: 2088

How to merge data from object A into object B in Python?

I'm trying to figure out if there's a programmatic way to merge data from object A into object B without assigning each field by hand.

For example, I have the following Pydantic model, which represents the result of an API call to The Movie Database:

class PersonScraperReply(BaseModel):
    """Represents a Person Scraper Reply"""

    scraper_name: str
    """Name of the scraper used to scrape this data"""

    local_person_id: int
    """Id of person in local database"""

    local_person_name: str
    """name of person in local database"""

    aliases: Optional[list[str]] = None
    """list of strings that represent the person's aliases obtained from scraper"""

    description: Optional[str] = None
    """String description of the person obtained from scraper"""

    date_of_birth: Optional[date] = None
    """Date of birth of the person obtained from scraper"""

    date_of_death: Optional[date] = None
    """Date the person passed away obtained from scraper"""

    gender: Optional[GenderEnum] = None
    """Gender of the person obtained from scraper"""

    homepage: Optional[str] = None
    """Person's official homepage obtained from scraper"""

    place_of_birth: Optional[str] = None
    """Location where the person was born obtained from scraper"""

    profile_image_url: Optional[str] = None
    """Url for person's profile image obtained from scraper"""

    additional_images: Optional[list[str]] = None
    """List of urls for additional images for the person obtained from scraper"""

    scrape_status: ScrapeStatus
    """status of scraping. Success or failure"""

I also have this SQLAlchemy class that represents a person in my database:

class PersonInDatabase(Base):

    id: int
    """Person Id"""

    name: str
    """Person Name"""
    
    description: str = Column(String)
    """Description of the person"""

    gender: GenderEnum = Column(Enum(GenderEnum), nullable=False, default=GenderEnum.unspecified)
    """Person's gender, 0=unspecified, 1=male, 2=female, 3=non-binary"""

    tmdb_id: int = Column(Integer)
    """Tmdb id"""

    imdb_id: str = Column(String)
    """IMDB id, in the format of nn[alphanumeric id]"""

    place_of_birth: str = Column(String)
    """Place of person's birth"""

    # dates
    date_of_birth: DateTime = Column(DateTime)
    """Date the person was born"""

    date_of_death: DateTime = Column(DateTime)
    """Date the person passed away"""

    date_last_person_scrape: DateTime = Column(DateTime)
    """Date last time the person was scraped"""

My goal is to merge the data I received from the API call into the database object. By "merge" I mean assigning the fields that exist in both objects and leaving the rest untouched. Something along the lines of:

person_scrape_reply = PersonScraperReply()
person_in_db = PersonInDatabase()


for field_in_API_name, field_in_API_value in person_scrape_reply.fields: #for field in API response
    if field_in_API_name in person_in_db.field_names and field_in_API_value is not None: #if field exists in PersonInDatabase and the value is not none
        person_in_db.fields[field_in_API_name] = field_in_API_value #assign API response value to field in database class.

Is something like this possible?

Upvotes: 2

Views: 6938

Answers (2)

Curtwagner1984

Reputation: 2088

The method @Daniel suggested (using attrs) raised errors for me. I'm sure it works with regular classes, but it fails with both SQLAlchemy and Pydantic classes.

After stepping through the code with the debugger, I saw that both Pydantic and SQLAlchemy provide a way to get their field names as strings. In SQLAlchemy, each item in inspect([SQLALCHEMY MAPPED CLASS]).attrs has a .key attribute, and Pydantic just has a built-in dict() method (renamed to model_dump() in Pydantic v2). Kind of silly of me to forget about it when a large selling point of Pydantic is that it can serialize data classes to JSON.

Anyway, with those two methods, this is what worked for me:

from sqlalchemy import inspect

def assign_empty(person_to_assign: Person, scrape_result: PersonScraperReply):
    """Fill fields of person_to_assign that are still None with values from scrape_result."""
    blacklisted_fields = ["aliases"]  # fields to ignore
    person_to_assign_fields = [x.key for x in inspect(person_to_assign).attrs]  # SQLAlchemy fields
    scrape_result_fields = [x for x in scrape_result.dict().keys() if x not in blacklisted_fields]  # Pydantic fields

    for field_name in scrape_result_fields:
        if field_name in person_to_assign_fields:
            person_to_assign_value = getattr(person_to_assign, field_name)
            scrape_result_value = getattr(scrape_result, field_name)

            # only fill in values that are missing in the database object
            if scrape_result_value is not None and person_to_assign_value is None:
                setattr(person_to_assign, field_name, scrape_result_value)
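
Stripped of SQLAlchemy and Pydantic, the same "fill only what is empty" pattern can be sketched with plain dataclasses, which makes it easy to try out in isolation. This is only an illustration under a few assumptions: ScrapedPerson, StoredPerson and fill_empty_fields are made-up names, and dataclasses.fields() stands in for the field lookups that inspect(...).attrs and dict() provide in the real classes.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import Any, Iterable, Optional


def fill_empty_fields(target: Any, source: Any, ignored: Iterable[str] = ()) -> list[str]:
    """Copy non-None values from source into fields of target that are still None.

    Both arguments are assumed to be dataclass instances.
    Returns the names of the fields that were updated.
    """
    ignored = set(ignored)
    target_names = {f.name for f in fields(target)}
    updated = []
    for f in fields(source):
        name = f.name
        # skip fields we were told to ignore or that the target does not have
        if name in ignored or name not in target_names:
            continue
        new_value = getattr(source, name)
        # never overwrite a value the target already holds
        if new_value is not None and getattr(target, name) is None:
            setattr(target, name, new_value)
            updated.append(name)
    return updated


# Hypothetical stand-in for the scraper reply
@dataclass
class ScrapedPerson:
    scraper_name: str
    description: Optional[str] = None
    place_of_birth: Optional[str] = None
    date_of_birth: Optional[date] = None
    aliases: Optional[list[str]] = None


# Hypothetical stand-in for the database row
@dataclass
class StoredPerson:
    name: str
    description: Optional[str] = None
    place_of_birth: Optional[str] = None
    date_of_birth: Optional[date] = None
    aliases: Optional[list[str]] = None


scraped = ScrapedPerson(
    scraper_name="tmdb",
    description="Actor",
    place_of_birth="London",
    aliases=["Jon"],
)
stored = StoredPerson(name="John", place_of_birth="Paris")

updated = fill_empty_fields(stored, scraped, ignored=["aliases"])
print(updated, stored)
```

Here only description is copied: place_of_birth already has a value in the stored object, date_of_birth is None in the scraped data, aliases is on the ignore list, and scraper_name does not exist on the target. Returning the list of updated field names is a design choice that makes it easy to log what a scrape actually changed.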

Upvotes: 3

Daniel

Reputation: 1995

Use the attrs package.

from attrs import define, asdict

@define
class PersonScraperReply(BaseModel):
    """Represents a Person Scraper Reply"""

    scraper_name: str
    """Name of the scraper used to scrape this data"""

    local_person_id: int
    """Id of person in local database"""

    local_person_name: str
    """name of person in local database"""

    aliases: Optional[list[str]] = None
    """list of strings that represent the person's aliases obtained from scraper"""

    description: Optional[str] = None
    """String description of the person obtained from scraper"""

    date_of_birth: Optional[date] = None
    """Date of birth of the person obtained from scraper"""

    date_of_death: Optional[date] = None
    """Date the person passed away obtained from scraper"""

    gender: Optional[GenderEnum] = None
    """Gender of the person obtained from scraper"""

    homepage: Optional[str] = None
    """Person's official homepage obtained from scraper"""

    place_of_birth: Optional[str] = None
    """Location where the person was born obtained from scraper"""

    profile_image_url: Optional[str] = None
    """Url for person's profile image obtained from scraper"""

    additional_images: Optional[list[str]] = None
    """List of urls for additional images for the person obtained from scraper"""

    scrape_status: ScrapeStatus
    """status of scraping. Success or failure"""

@define
class PersonInDatabase(Base):

    id: int
    """Person Id"""

    name: str
    """Person Name"""
    
    description: str = Column(String)
    """Description of the person"""

    gender: GenderEnum = Column(Enum(GenderEnum), nullable=False, default=GenderEnum.unspecified)
    """Person's gender, 0=unspecified, 1=male, 2=female, 3=non-binary"""

    tmdb_id: int = Column(Integer)
    """Tmdb id"""

    imdb_id: str = Column(String)
    """IMDB id, in the format of nn[alphanumeric id]"""

    place_of_birth: str = Column(String)
    """Place of person's birth"""

    # dates
    date_of_birth: DateTime = Column(DateTime)
    """Date the person was born"""

    date_of_death: DateTime = Column(DateTime)
    """Date the person passed away"""

    date_last_person_scrape: DateTime = Column(DateTime)
    """Date last time the person was scraped"""


person_scrape_reply = PersonScraperReply()
person_in_db = PersonInDatabase()
scrape_asdict = asdict(person_scrape_reply)
db_asdict = asdict(person_in_db)

for field_in_API_name, field_in_API_value in scrape_asdict.items(): #for field in API response
    if field_in_API_name in db_asdict.keys() and field_in_API_value is not None: #if field exists in PersonInDatabase and the value is not none
        setattr(person_in_db, field_in_API_name, field_in_API_value) #assign API response value to field in database class.

Upvotes: 3
