Reputation: 281
I have a collection of 40-50 text files that contain markdown. Some of them contain duplicate words, sentences, and paragraphs. I'm looking for a script/algorithm to scan the files and help me identify matches (or near matches). Where can I find such a thing? Searching for this type of thing online yielded results for other types of problems, but not this one. Would appreciate any clues to help me narrow my search...
Upvotes: 0
Views: 295
Reputation: 300
basically, a simple brute forces can solve all of your problems. But you should consider another algorithms depend on your requirement (timing, memory,...): Boyer–Moore, Rabin–Karp string search algorithm, Knuth–Morris–Pratt algorithm.
Upvotes: 1