Reputation: 867
I'm trying to solve the Euler Projects as an exercise to learn Python, for the last few days after work, and am now at Problem 18
I looked at the problem, and thought it could be solved by using Dijkstra's algorithm, with the values of the nodes as negative integers, so as to find the "longest" path.
My solution seems to be almost correct (I get 1068) - which is to say wrong. It prints a path, but from what I can tell, it's not the right one. But having looked at it from some time, I can't tell why.
Perhaps this problem cannot be solved by my approach, and I need some other approach, like dynamic programming - or maybe my implementation of Dijkstra is faulty?
I'm pretty confident the parsing from file to graph is working as intended.
This is the data-set:
75
95 64
17 47 82
18 35 87 10
20 04 82 47 65
19 01 23 75 03 34
88 02 77 73 07 63 67
99 65 04 28 06 16 70 92
41 41 26 56 83 40 80 70 33
41 48 72 33 47 32 37 16 94 29
53 71 44 65 25 43 91 52 97 51 14
70 11 33 28 77 73 17 78 39 68 17 57
91 71 52 38 17 14 91 43 58 50 27 29 48
63 66 04 68 89 53 67 30 73 16 69 87 40 31
04 62 98 27 23 09 70 98 73 93 38 53 60 04 23
This is the code. It a fully "working example", as long as the path to the file with the content above is correct.
class Graph:
def __init__(self):
self.nodes = []
self.edges = []
def add_node(self, node):
self.nodes.append(node)
def add_edge(self, edge):
self.edges.append(edge)
def edges_to_node(self, n):
edges = [edge for edge in self.edges if edge.node1.id == n.id]
return edges
class Node:
def __init__(self, id, value, goal):
self.id = id
self.value = value
self.goal = goal
self.visited = False
self.distance = 10000
self.previous = None
def __str__(self):
return "{} - {}".format(self.value, self.goal)
def __repr__(self):
return "{} - {}".format(self.value, self.goal)
class Edge:
def __init__(self, node1, node2):
self.node1 = node1
self.node2 = node2
f = open("problem18.data", "r")
content = f.read()
lines = content.split("\n")
data = []
graph = Graph()
index_generator = 1
last_line = len(lines) - 1
for i in range(len(lines)):
data.append([])
numbers = lines[i].split()
for number in numbers:
goal = i == last_line
data[-1].append(Node(index_generator, -int(number), goal))
index_generator += 1
for i in range(len(data)):
for j in range(len(data[i])):
node = data[i][j]
graph.add_node(node)
if i != last_line:
node2 = data[i+1][j]
node3 = data[i+1][j+1]
edge1 = Edge(node, node2)
edge2 = Edge(node, node3)
graph.add_edge(edge1)
graph.add_edge(edge2)
def dijkstra(graph, start):
start.distance = 0
queue = [start]
while len(queue):
queue.sort(key=lambda x: x.value, reverse=True)
current = queue.pop()
current.visited = True
if current.goal:
return reconstrcut_path(start, current)
edges = graph.edges_to_node(current)
for edge in edges:
neighbour = edge.node2
if neighbour.visited:
continue
queue.append(neighbour)
new_distance = current.distance + neighbour.value
if new_distance < neighbour.distance:
neighbour.distance = new_distance
neighbour.previous = current
return []
def reconstrcut_path(start, n):
path = []
current = n
while current.id is not start.id:
path.append(current)
current = current.previous
path.append(start)
return path
path = dijkstra(graph, graph.nodes[0])
tally = 0
for node in path:
number = max(node.value, -node.value)
print(number)
tally += number
print(tally)
Can you help me troubleshoot what is wrong with this solution?
EDIT: The console output of the run:
98
67
91
73
43
47
83
28
73
75
82
87
82
64
75
1068
Upvotes: 1
Views: 306
Reputation: 64904
Consider this alternative data set:
01
00 01
00 00 01
00 00 00 01
99 00 00 00 01
At least with negated costs, Dijkstra would explore that path of 1s and the zeroes that are "just off the path", but nothing else. The node that takes an other step down that path of 1s is always the best node in the queue, and it ends in a goal node so the algorithm terminates without exploring the rest of the triangle. It would never even see that there is a 99 hiding in the bottom left corner.
Upvotes: 1
Reputation: 77837
Actually, dynamic programming will knock this off neatly. My solution for this and problem 67 is less than 20 lines.
The focus here is very much a Dijkstra approach: work your way down the triangle, maintaining the maximum path cost at each node. Row 1 is trivial:
75
Row 2 is similarly trivial, as both values are ends: each has only one possible path:
95+75 64+75
which evaluates to
170 139
Row 3 has two ends, but the middle value gives us the critical logic: keep the larger of the two paths:
17+170 47+max(170, 139) 82+139
187 217 221
Row 4 has two middles ... just keep going with the process:
18+187 35+max(187, 217) 87+max(217, 221) 10+221
205 252 308 231
Can you take it from here?
As a check for you, the correct answer is quite close to the one you originally got.
Your solution fails because you didn't apply Dijkstra's algorithm. That requires that you maintain the best path to each node you've reached in your search. Instead, you used a row-by-row greedy algotriothm: you kept only the best path so far in the entire pass.
Specifically, when you found the 98
near the right side of the bottom row, you forced an assumption that it was part of the optimum path. You continued this, row by row. The data set is configured specifically to make this approach fail. The best path starts with the 93 + 73 + 58
sequence.
You have to keep all paths in mind; there's a path that is not the best sum for the bottom couple of rows, but catches up in the middle rows while the "fat" path gets starved with some lower numbers in the middle.
Upvotes: 4