Reputation: 51
I have been stuck on a following question for a while. This problem is a classic Word ladder problem with the slight modifications. I need to write a function which would return an index of the first_word
in some given undirected graph (where each edge has weight 1) such that it creates the overall shortest word ladders to each of the words in desired_words
(to be more precise it returns an index of the word for which the longest word ladder to any of the words in desired_words
is as short as possible); if multiple words are possible it would just return one of them.
Here is some quick example: Let's say I have following graph
Let's say I have following list, lst = ["aaa","aad","dad","daa","aca","acc","aab","abb"]
, After that we create an instance of the class of Graph. And if for example our desired_words
are dad, abb, acc
then our function would return 0 because the best first_word
in this case would be aaa
. If we run our function on aaa, aca, acc
then it would return 4 because in this case the best first_word
would be aca
.
I was thinking of various approaches: run Dijkstra' algorithm on each node, and then find the shortest distance to each of the desired_words
, or find a graph (Jordan) center, of find a lowest common ancestor between the nodes, but none of them seem to make sense to me. I just don't know how can I apply any of the known algorithms, such that it would return the most optimal node.
Can anyone suggest what would be the correct approach to tackle this problem? Just a general idea, or some step-by-step algorithm. Thank you very much.
P.S. python code would be helpful but not necessary.
Upvotes: 1
Views: 243
Reputation: 350137
You could launch a separate BFS sourced at each of the desired nodes, ... but in sync. As soon as you find a node that has been visited from all of the BFS sources, you have found a solution.
The key is thus to maintain for each node a visited
flag per source. This could be a set of source nodes (a source node is at the start of a BFS).
Here is an implementation. The first part builds the graph (as adjacency list), and the second part performs the multiple BFS:
from collections import defaultdict
def ladder(words):
# Group words when they share a n-1 subsequence
common = [defaultdict(list) for _ in words[0]]
for word in words:
for i, ch in enumerate(word):
# i-th character removed is key
common[i][word[:i] + word[i+1:]].append(word)
# Create the graph from those word groups:
graph = defaultdict(list)
for dct in common:
for words in dct.values():
for i, word in enumerate(words):
graph[word].extend(words[:i] + words[i+1:])
return dict(graph)
def getfirstword(ladder, desiredwords):
visited = { # For each word: one flag per BFS source
word: set()
for word in graph
}
desiredwords = set(desiredwords) # Remove duplicates
n = len(desiredwords)
frontiers = { # One entry per BFS source
word: [word]
for word in desiredwords
}
for word in desiredwords:
visited[word].add(word)
while any(frontiers): # If ever False, then graph is disconnected
for source, words in frontiers.items():
frontier = []
for word in words:
for neighbor in graph[word]:
if source not in visited[neighbor]:
visited[neighbor].add(source)
if len(visited[neighbor]) == n:
return neighbor
frontier.append(neighbor)
frontiers[source] = frontier
Your example graph is created as follows:
words = ["aaa","aad","dad","daa","aca","acc","aab","abb"]
graph = ladder(words)
print("Adjacency list:")
for word, neighbors in graph.items():
print(f"'{word}': {neighbors}")
This outputs:
Adjacency list:
'aaa': ['daa', 'aca', 'aad', 'aab']
'daa': ['aaa', 'dad']
'aad': ['dad', 'aaa', 'aab']
'dad': ['aad', 'daa']
'aca': ['aaa', 'acc']
'acc': ['aca']
'aab': ['abb', 'aaa', 'aad']
'abb': ['aab']
The two example queries you discussed can be made as follows:
queries = [
["dad", "abb", "acc"],
["aaa", "aca", "acc"]
]
for query in queries:
print(f"For desired words {query} a solution is: '{getfirstword(graph, query)}'")
This outputs:
For desired words ['dad', 'abb', 'acc'] a solution is: 'aaa'
For desired words ['aaa', 'aca', 'acc'] a solution is: 'aca'
Upvotes: 2