Reputation: 19627
Okay, sorry if my problem seems a bit rough. I'll try to explain it in a figurative way, I hope this is satisfactory.
10 children.
5 boxes.
Each child chooses three boxes.
Each box is opened:
- If it contains something, all children selected this box gets 1 point
- Otherwise, nobody gets a point.
My question is about what I put in bold. Because in my code, there are lots of kids and lots of boxes.
Currently, I proceed as follows:
children = {"child_1" : 0, ... , "child_10": 0}
gp1 = ["child_3", "child_7", "child_10"] #children who selected the box 1
...
gp5 = ["child_2", "child_5", "child_8", "child_10"]
boxes = [(0,gp1), (0,gp2), (1,gp3), (1,gp4), (0,gp5)]
for box in boxes:
if box[0] == 1: #something inside
for child in box[1]:
children[child] += 1
I worry mainly about the for loop that assigns each child an extra point. Because in my final code, I have many many children, I fear that doing so would slow the program too.
Is there a more efficient way for all children of the same group may have their point faster?
Upvotes: 10
Views: 391
Reputation: 42421
Two general points:
(1) Based on what you've told us, there's no reason to focus your energy on minor performance optimizations. Your time would be better spent thinking about ways to make your data structures less awkward and more communicative. A bunch of interrelated dicts, lists, and tuples quickly becomes difficult to maintain. For an alternative, see the example below.
(2) As the game designer, you understand that events follow a certain sequence: first the kids select their boxes, and later they discover whether they get points for them. But you don't have to implement it that way. A kid can choose a box and get points (or not) immediately. If there's a need to preserve the child's ignorance about such outcomes, the parts of your algorithm that depend on such ignorance can enforce that veil of secrecy as needed. The upshot: there is no need for a box to loop through its children, awarding points to each one; instead, award the points immediately to kids as boxes are selected.
import random
class Box(object):
def __init__(self, name):
self.name = name
self.prize = random.randint(0,1)
class Child(object):
def __init__(self, name):
self.name = name
self.boxes = []
self.score = 0
self._score = 0
def choose(self, n, boxes):
bs = random.sample(boxes, n)
for b in bs:
self.boxes.append(b)
self._score += b.prize
def reveal_score(self):
self.score = self._score
boxes = [Box(i) for i in range(5)]
kids = [Child(i) for i in range(10)]
for k in kids:
k.choose(3, boxes)
# Later in the game ...
for k in kids:
k.reveal_score()
print (k.name, k.score), '=>', [(b.name, b.prize) for b in k.boxes]
Upvotes: 1
Reputation: 179452
Represent children as indices into arrays, not as strings:
childrenScores = [0] * 10
gp1 = [2,6,9] # children who selected box 1
...
gp5 = [1,4,7,9]
boxes = [(0,gp1), (0,gp2), (1,gp3), (1,gp4), (0,gp5)]
Then, you can store childrenScores
as a NumPy array and use advanced indexing:
childrenScores = np.zeros(10, dtype=int)
...
for box in boxes:
if box[0]:
childrenScores[box[1]] += 1 # NumPy advanced indexing
This still involves a loop somewhere, but the loop is deep inside NumPy instead, which should provide a meaningful speedup.
Upvotes: 5
Reputation: 3913
What I would do
In the gpX
lists don't save the "name of the child" (e.g. "child_10"
) but save a reference to the child's number of points.
How to do that
Using the fact that lists are objects in python, you can:
children = {"child_0": [0], "child_1": [0], ...}
and so on.gp1.append(children["child_0"])
).for child in box[1]: child[0]+=1
. This WILL update the children
dict.EDIT:
Why this is faster:
Because you leave out the part where you search for children[child]
, which might be costly.
This technique works because by storing the totals in a mutable type, and appending those values to the group lists, both the dict value and each box's list value will point to the same list entries, and changing one will change the other.
Upvotes: 1
Reputation: 1081
If you don't need to immediately print the number of points for every child, you could calculate it on demand, thus saving time. This could help if you only need to query a child every now and then for how many points it has. You can cache each result as you obtain is so you don't go about calculating it again the next time you need it.
Firstly, you'll need to know which groups a child belongs to. We'll store this information as map we'll call childToGroupsMap, which will map each child to an array holding the boxes it belongs to, like so:
childToGroupsMap = {}
for child in children:
childToGroupsMap[child[0]] = []
for box in boxes:
for child in box[1]:
if (box[1] not in childToGroupsMap[child]):
childToGroupsMap[child].append(box[1])
This constructs the reverse map from children to boxes.
It'll also help to have a map from each box to a boolean representing whether it's been opened:
boxToOpenedMap = {}
for box in boxes:
boxToOpenedMap[box[1]] = box[0]
Now, when someone queries how many points a child has, you can go through each of the boxes it belongs to (using childToGroupsMap
, of course), and simply count how many of those boxes have been mapped to 1
in the map boxes
:
def countBoxesForChild(child):
points = 0
for box in childToGroupsMap[child]
if boxToOpenedMap[box] == 1:
points += 1
return points
To make this better, you can cache the resulting number of points. Make a map like so:
childToPointsCalculated = {}
for child in children:
childToPointsCalculated[child[0]] = -1
Where the -1
denotes that we don't yet know how many points this child has.
Finally, you can modify your countBoxesForChild
function to exploit the cache:
def countBoxesForChild(child):
if childToPointsCalculated[child] != -1
return childToPointsCalculated[child]
points = 0
for box in childToGroupsMap[child]
if boxToOpenedMap[box] == 1:
points += 1
childToPointsCalculated[child] = points
return points
Upvotes: 0
Reputation: 48330
The only speed up that I can think of is to use numpy arrays and stream the sum operation.
children[child] += np.ones(len(children[child]))
You should benchmark the operation and see if that is too slow for your business case.
Upvotes: 2
Reputation: 1302
One way or another, you're going to be looping over the children, and your answer appears to avoid looping over children who don't get any points.
It might be slightly faster to use filter or itertools.ifilter to pick the boxes that have something in them:
import itertools
...
for box in itertools.ifilter(lambda x: x[0], boxes):
for child in box[1]
children[child] += 1
Upvotes: 0