Reputation: 9265
Given two sequences of identifier, how to find the smallest operation sequence that will transform the first sequence of identifier to the second one.
Operation can be :
Note: identifiers are unique and can't appear twice in a sequence
Example:
Sequence1 [1, 2, 3, 4, 5]
Sequence2 [5, 1, 2, 9, 3, 7]
Result (index are 0 based) :
- Remove at 3
- Move from 3 to 0
- Insert '9' at 3
- Insert '7' at 5
Thanks !
Upvotes: 4
Views: 114
Reputation: 726849
Start by finding the longest common subsequence. This will identify the elements that will not move:
[(1), (2), (3), 4, 5]
Elements of the LCS are enclosed in parentheses.
Go through both sequences from index 0, recording the operations required to make the sequences identical. If the current item of the first sequence is not part of the LCS, remove it, and mark the place where it has been before, in case you need to insert it at a later time. If the current element is part of the LCS, insert the element from the second sequence in front of it. This could be either a simple insertion, or a move. If the item that you are inserting is in the original list, make it a move; otherwise, make it an insert.
Here is a demo using your example. Curly braces show the current element
[{(1)}, (2), (3), 4, 5] vs [{5}, 1, 2, 9, 3, 7]
1
is a member of LCS, so we must insert 5
. 5
is in the original sequence, so we record a move: MOVE 4 to 0
[5, {(1)}, (2), (3), 4] vs [5, {1}, 2, 9, 3, 7]
Items are the same, so we move on to the next one:
[5, (1), {(2)}, (3), 4] vs [5, 1, {2}, 9, 3, 7]
Again the numbers are the same - move to the next one:
[5, (1), (2), {(3)}, 4] vs [5, 1, 2, {9}, 3, 7]
3
is a member of LCS, so we must insert 9
. The original element does not have 9
, so it's a simple insertion: INSERT 9 at 3
[5, (1), (2), 9, {(3)}, 4] vs [5, 1, 2, 9, {3}, 7]
Yet again the numbers are the same - move to the next one:
[5, (1), (2), 9, (3), {4}] vs [5, 1, 2, 9, 3, {7}]
'4' is not a member of LCS, so it gets deleted: DEL at 5
[5, (1), (2), 9, (3)] vs [5, 1, 2, 9, 3, {7}]
We reached the end of the first sequence - we simply add the remaining items of the second sequence to the first one, paying attention to the list of prior deletions. For example, if 7
had been removed earlier, we would transform that deletion into a move at this time. But since the original list did not have 7
, we record our final operation: INS 7 at 5
.
Upvotes: 1
Reputation: 31663
This metric is called Levenshtein distance or more precisely Damerau–Levenshtein distance.
There are implementations for almost every possible programming language, that you can use resolve the problem you described.
Upvotes: 1