Reputation: 1180
I have a long list of groups in JSON and I want a little utility:
def verify_group(group_id):
    group_ids = set()
    for grp in groups:
        group_ids.add(grp.get("pk"))
    return group_id in group_ids
The obvious approach is to load the set outside the function, or otherwise declare a global -- but let's assume I don't want a global variable.
In statically typed languages I can declare the set static and, I believe, that will accomplish my aim. How would one do something similar in Python? That is: the first call initializes the set group_ids, and subsequent calls reuse the set initialized in the first call.
BTW, when I use the profilestats package to profile this little code snippet, I see these frightening results:
   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      833    0.613    0.001    1.059    0.001 verify_users_groups.py:25(verify_group)
  2558976    0.253    0.000    0.253    0.000 {method 'get' of 'dict' objects}
  2558976    0.193    0.000    0.193    0.000 {method 'add' of 'set' objects}
I tried adding functools.lru_cache -- but that type of optimization doesn't address my present question -- how can I load the set group_ids once inside a def block?
Thank you for your time.
Upvotes: 2
Views: 207
Reputation: 101929
There isn't an equivalent of static; however, you can achieve the same effect in different ways:
One way is to abuse the infamous mutable default argument:
def verify_group(group_id, group_ids=set()):
    if not group_ids:
        group_ids.update(grp.get("pk") for grp in groups)
    return group_id in group_ids
This however allows the caller to change that value (which may be a feature or a bug for you).
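To make that caveat concrete, here is a small sketch of how the mutable default behaves. The groups list is hypothetical sample data standing in for the parsed JSON, not something from the question.

```python
# Hypothetical sample data standing in for the parsed JSON list.
groups = [{"pk": 1}, {"pk": 2}]


def verify_group(group_id, group_ids=set()):
    # The default set is created once, when the def statement runs,
    # so it persists between calls.
    if not group_ids:
        group_ids.update(grp.get("pk") for grp in groups)
    return group_id in group_ids


first = verify_group(1)           # first call fills the shared default set
missing = verify_group(3)         # later calls reuse that set
overridden = verify_group(3, {3})  # the caller's set replaces the cached one

# The cached set lives on the function object and can be inspected or mutated.
cached = verify_group.__defaults__[0]
print(first, missing, overridden, sorted(cached))
```

Note that an empty group list would be rebuilt on every call, because the emptiness check cannot tell "not yet loaded" from "loaded but empty".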
I usually prefer using a closure:
def make_group_verifier():
    group_ids = {grp.get("pk") for grp in groups}

    def verify_group(group_id):
        # nonlocal group_ids  # if you need to change its value
        return group_id in group_ids

    return verify_group

verify_group = make_group_verifier()
Python is an OOP language. What you describe is an instance method. Initialize the class with the set and call the method on the instance.
class GroupVerifier:
    def __init__(self):
        self.group_ids = {grp.get("pk") for grp in groups}

    def verify(self, group_id):
        # could be __call__
        return group_id in self.group_ids
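As a sketch of the __call__ variant mentioned in the comment, the instance can stand in for the original function. The sample groups data is hypothetical, and passing groups to the constructor is my own variation rather than part of the class above.

```python
# Hypothetical sample data standing in for the parsed JSON list.
groups = [{"pk": 10}, {"pk": 20}]


class GroupVerifier:
    def __init__(self, groups):
        # Build the set once, when the instance is created.
        self.group_ids = {grp.get("pk") for grp in groups}

    def __call__(self, group_id):
        # Defining __call__ lets the instance be used like a function.
        return group_id in self.group_ids


verify_group = GroupVerifier(groups)
print(verify_group(10))  # True
print(verify_group(99))  # False
```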
I'd also like to add that it depends on the API design. You could let the caller take responsibility for pre-computing and providing the value if they want performance. This is the choice taken by, for example, random.choices. The cum_weights parameter isn't necessary, but it allows the user to remove the cost of computing that array on every call in performance-critical code. So instead of having a mutable default argument, you use None as the default and compute the set only if the value passed is None; otherwise you assume the caller did the work for you.
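A minimal sketch of that None-default pattern might look like the following. The groups sample data and the known_ids name are assumptions made for illustration.

```python
# Hypothetical sample data standing in for the parsed JSON list.
groups = [{"pk": 1, "name": "admins"}, {"pk": 2, "name": "staff"}]


def verify_group(group_id, group_ids=None):
    """Return True if group_id is a known group primary key."""
    if group_ids is None:
        # Slow path: the caller did not supply the set, so build it now.
        group_ids = {grp.get("pk") for grp in groups}
    return group_id in group_ids


# Performance-sensitive callers build the set once and pass it in.
known_ids = {grp.get("pk") for grp in groups}
results = [verify_group(gid, known_ids) for gid in (1, 2, 3)]
print(results)  # [True, True, False]
```

Callers that omit group_ids still get a correct answer and simply pay the cost of rebuilding the set on each call.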
Upvotes: 3