Reputation: 21883
I have a structure like this (pseudo code):
class Player:
    steamid: str
    hero: "Hero"

class Hero:
    class_id: str
    level: int
    xp: int
    skills: list["Skill"]

class Skill:
    class_id: str
    level: int
Now I'm trying to store it in a database, and I gave my Player class a get_serialized_data() method which returns a tuple like so:
def get_serialized_data(self):
    hero = self.hero
    return (
        # players
        (self.steamid, hero.class_id),
        # heroes
        (self.steamid, hero.class_id, hero.level, hero.xp),
        # skills
        (
            (self.steamid, hero.class_id, skill.class_id, skill.level)
            for skill in hero.skills
        ),
    )
Finally, I'm storing every player's data into the database all at once, using three calls to executemany(): one for players, one for heroes, and one for skills.
And here's my code to do that:
def save_all_data(*, commit=True):
    """Save every active player's data into the database."""
    players_data = []
    heroes_data = []
    skills_data = []
    for player in _players.values():
        player_data, hero_data, skills_data_ = player.get_serialized_data()
        players_data.append(player_data)
        heroes_data.append(hero_data)
        skills_data.extend(skills_data_)
    _database.save_players(players_data)
    _database.save_heroes(heroes_data)
    _database.save_skills(skills_data)
    if commit:
        _database.commit()
The "problem", as you can see, is that I construct three large lists. Is it possible to replace these lists with generators somehow? My _database.save_X()
methods all accept generators, so it would save a lot of RAM.
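Each save method is essentially a thin wrapper around executemany(). A simplified sketch of one of them, using sqlite3 and made-up column names purely for illustration:
import sqlite3

_connection = sqlite3.connect("players.db")

def save_players(players_data):
    # executemany() accepts any iterable of parameter tuples, including
    # a generator, so players_data doesn't have to be a list.
    _connection.executemany(
        "INSERT OR REPLACE INTO players (steamid, hero_class_id) VALUES (?, ?)",
        players_data,
    )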
Edit: Also, I don't want to loop through the players three times. So I'd love to get three generators somehow during one loop.
Upvotes: 2
Views: 135
Reputation: 104682
There's no way to avoid storing O(len(players)) worth of data if you want to save the sets of your player, hero, and skill data in separate operations on the database (rather than doing one operation for each player with their associated hero and skill data, or saving it all somehow in parallel).
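For example, the per-player alternative would look something like this. This is just a sketch, assuming your save_X() methods accept any iterable; it keeps memory flat, but you lose the benefit of batching all players into a single executemany() call:
def save_all_data(*, commit=True):
    """Save every active player's data, one player at a time."""
    for player in _players.values():
        player_data, hero_data, skills_data = player.get_serialized_data()
        # No big lists: each player's rows go straight to the database.
        _database.save_players([player_data])
        _database.save_heroes([hero_data])
        _database.save_skills(skills_data)
    if commit:
        _database.commit()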
Generators won't help you here. Even if you could come up with a generator that returned the hero and skill data, it would need to maintain a list (or some other data structure) in the background unless your three database saves were happening in parallel.

You might want to compare what you're asking for to the implementation of itertools.tee, which creates several "copies" of an input iterator. It's only space-efficient if you iterate over the copies in parallel (with, for instance, zip), rather than one by one. If you iterate over the copies one by one, it's essentially the same as copying the iterator's contents into a list and iterating over that repeatedly.
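To make that concrete, a tee-based version would look roughly like this (names made up, sketch only). It loops over the players just once, but because the three streams are consumed one after another, tee has to buffer every item in the meantime, so it saves no memory over building the three lists:
import itertools

def _serialized_players():
    # One pass over the players, yielding each player's three pieces together.
    for player in _players.values():
        yield player.get_serialized_data()

def save_all_data(*, commit=True):
    stream_a, stream_b, stream_c = itertools.tee(_serialized_players(), 3)
    _database.save_players(p for p, _, _ in stream_a)
    # By now tee has buffered everything, because stream_b and stream_c
    # haven't been advanced yet.
    _database.save_heroes(h for _, h, _ in stream_b)
    _database.save_skills(row for _, _, skills in stream_c for row in skills)
    if commit:
        _database.commit()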
Upvotes: 2