Reputation: 1
This a Google interview questions that is now a problem in Google's Foundations of Programming course. I was trying to understand the solution to the following problem:
"Given a string S and a set of words D, find the longest word in D that is a subsequence of S. Word W is a subsequence of S if some number of characters, possibly zero, can be deleted from S to form W, without reordering the remaining characters. Note: D can appear in any format (list, hash table, prefix tree, etc. For example, given the input of S = "abppplee" and D = {"able", "ale", "apple", "bale", "kangaroo"} the correct output would be "apple" "
So the main doubt I have is in the section the specifies An Optimal method O(N+L) for small alphabet. In the previous approach i.e O(N + L log N), the author builds a hash table mapping characters -> sorted indexes in String
So for "abppple", the hash table is:
a -> [0]
b -> [1]
p -> [2,3,4]
l -> [5]
e -> [6]
In the optimal approach for small alphabet, it is mentioned using a dense vector representation instead of the above sparse vector representation.
p -> [2, 2, 3, 4, -1, -1, -1, -1]
What are they trying to represent? The string S = "abppplee" does have 8 characters but I don't get what the dense vector is representing? Is it an error or am I missing something?
Thank you.
Upvotes: 0
Views: 954
Reputation: 13087
As far as I understand it, this vector should address the question (with Y
set to character p)
Given my last matched character was at index X and my next character to match is Y, where does this match occur?
In other words, if you are at index i
in the original string, the element of the vector p[i]
tells you where is the closest next character p in the string.
To be more specific, the original string is s=abppplee
and we want to verify:
p -> [2, 2, 3, 4, -1, -1, -1, -1]
This means that if we are at index 0
, the closest p is at index 2
. For index 1
, the closest p is also at 2
. If we are at index 2
, the closest next p is at position 3
, etc. The values -1
signify that there is no p in the rest of the string (for example at position 4
, the remaining characters are indeed just lee
, thus there is no p).
Upvotes: 1