Reputation: 27276
Part of my current project is comparing search results (filenames only) with a search string (multiple words). I have a very basic mechanism I use at the moment to identify the relevance of a result, all handled with one function.
When the search begins, I split the search string into a string list of keywords...
procedure TSearcherThread.ParseKeywords;
var
  S, T: String;
  P: Integer;
begin
  //Clear current list of keywords
  FKeywords.Clear;
  S:= LowerCase(Trim(FSearchString));
  //Remove all excess spaces (collapse double spaces into single ones)
  while Pos('  ', S) > 0 do
    S:= StringReplace(S, '  ', ' ', [rfReplaceAll]);
  //Make sure the string ends with a delimiter (check the last character)
  if Copy(S, Length(S), 1) <> ';' then
    S:= S + ';';
  //Parse out keywords
  while Length(S) > 0 do begin
    P:= Pos(';', S);
    T:= Trim(Copy(S, 1, P-1));
    Delete(S, 1, P);
    //Skip empty entries so they cannot lower the match ratio
    if T <> '' then
      FKeywords.Append(T);
  end;
end;
Now when I'm iterating through the master list of files to be searched, I pass each filename into this function...
function TSearcherThread.MatchKeywords(const Filename: String): Single;
var
  S: String; //Temp keywords
  FN: String; //Filename
  X: Integer; //Iterator
  C: Integer; //Match counter
begin
  Result:= 0; //Default no match
  S:= Trim(LowerCase(FSearchString)); //Lowercase Keywords, trim outside spaces
  //Get lowercase filename, stripping only the final extension
  //(cutting at the first '.' would truncate names like 'my.file.txt')
  FN:= LowerCase(ChangeFileExt(ExtractFileName(Filename), ''));
  //Check if exact match
  if FN = S then Result:= 2;
  //If nothing matches yet, then look for individual keywords...
  if Result < 2 then begin
    C:= 0;
    if FKeywords.Count > 0 then begin
      //Iterate through keywords
      for X := 0 to FKeywords.Count - 1 do begin
        //If keyword is found in filename
        if Pos(FKeywords[X], FN) > 0 then begin
          Inc(C);
        end;
      end;
      //Return the fraction of keywords that were found
      Result:= C / FKeywords.Count;
    end;
  end;
end;
The function returns a decimal relevance score. A result of 0 means no match; a value between 0 and 1 means a partial match, where a higher number is a better match; 1 means all keywords were found; and 2 means an exact match. I can also include only results that reach a certain percentage, like this:
M:= MatchKeywords(Filename);
if M >= 0.2 then AddResult(Filename);
The problem is that my method above considers only the AND operation: it expects all the keywords and measures how many of them were found. I would also like to support combinations of AND and OR operations together, which my structure doesn't allow. So I need to rewrite the guts of this function to make this possible.
What I would like to know is not how to write this, but whether there is something in Delphi that already makes this possible. Someone mentioned to me that TDictionary, as a hash table, is what I would need, but I have no clue how it relates to what I'm doing, as I've never used one. I just don't want to reinvent the wheel of pattern matching if it already exists in Delphi XE2.
Upvotes: 1
Views: 633
Reputation: 34
The simplest way to find string patterns is to use a regular expression engine. You can find some free units and packages on the FPC website:
http://wiki.freepascal.org/Regexpr
It is also worth reading more about regular expressions.
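As a rough illustration of how regular expressions can express AND and OR together, here is a minimal sketch. It assumes the TRegEx record from the System.RegularExpressions unit that ships with Delphi XE2, rather than the FPC unit linked above. The helper name MatchesQuery, the patterns, and the sample filenames are all hypothetical. Alternation (|) gives OR, and chained lookaheads give AND in any order.

```delphi
program RegexKeywordDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.RegularExpressions;

// Hypothetical helper: True when FileName satisfies Pattern, ignoring case.
function MatchesQuery(const FileName, Pattern: string): Boolean;
begin
  Result := TRegEx.IsMatch(FileName, Pattern, [roIgnoreCase]);
end;

const
  // OR: either keyword may appear anywhere in the name.
  OrPattern = 'report|invoice';
  // AND: every lookahead must succeed, so both keywords are required.
  AndPattern = '^(?=.*report)(?=.*2012)';
  // Mixed: "report" AND ("2012" OR "2013").
  MixedPattern = '^(?=.*report)(?=.*(2012|2013))';

begin
  Writeln(MatchesQuery('Invoice_March', OrPattern));         // matches
  Writeln(MatchesQuery('Annual_Report_2012', AndPattern));   // matches
  Writeln(MatchesQuery('Annual_Report_2011', MixedPattern)); // does not match
end.
```

Note that this only answers yes or no. It does not produce a relevance score like the function in the question, and keywords typed by a user would need any regex metacharacters escaped before being placed into a pattern.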
Upvotes: 1