Johnny
Johnny

Reputation: 549

Finding common elements in two arrays

I’ve declared a type similar to the following.

type
  TLikes = record
    Name            : string[20];
    favColours  : array of string[20];

  faves     = array of TLikes;

Once the records are populated I save them to a binary file so the structure is like that shown below.

[John],     [Green]     [White]     [Blue]
[Paul],     [Blue]      [Red]       [White]     [Green]
[David],    [Red]       [Blue]      [Green]
[Bob],      [White]     [Blue]
[Peter],    [Blue]      [Green]     [Red]

It’s easy to find out what colours David, for example, likes. A small problem occurs when I want the to know who likes blue. So what I’ve done is build a second file, like so …

[Blue],     [John]      [Paul]      [David]     [Peter]         [Bob]
[Red],      [David]     [Paul]      [Peter]
[White],    [Bob]       [David]     [John]      [Paul]
[Green],    [John]      [David]     [Paul]      [Peter]

But something is telling me, I shouldn’t really need to create a second file / data structure, it just seems inefficient.

Here’s a bigger issue ….

What if I need to find who likes any combination of what David likes? My results would be …

Blue and red and green  =   Paul, David, Peter
Blue and red            =   Paul, David, Peter
Blue and green          =   John, Paul, David, Peter
Red and Green           =   Paul, David, Peter

My question is.

Is there a better way to structure the data / records so I can figure out what Bob and Paul have in common (Blue and White) or what red and white have in common (David and Paul) ?

I guess I need to point out that I have tried to simplify the example above. In reality the data for Tlikes.Name will be strings like …

‘decabbadc’
‘bacddbcad’
‘eebadeaac’

There are something in the order of 200k+ of these strings. And the Tlikes.FavColours data is a filename (there are around 2k of these files). The file name indicates a file that contains the Tlikes.Name string.

I want to be able to retrieve a list of file names given a Tlikes.Name string or a list of strings given a file name.

NB – Something is drawing me to ‘sets’ but from the little I understand, I’m limited in the number of elements in sets, am I on the right track ?

Thank you for taking the time to read the post.

Upvotes: 2

Views: 1830

Answers (2)

LU RD
LU RD

Reputation: 34899

Here is a generic set, TSet<T> which could be used as a tool to get relations between your data.

TSet<T> can hold data of simple types, not restricted to byte size as the normal Set type.

Supports:

  • Include (addition)
  • Exclude (subtraction)
  • Intersect (mutual inclusion, multiplication)
  • Symmetrical difference (mutual exclusion, xor)
  • Test for contains (in operator)
  • Test for equality (equal operator)
  • Test for superset of (>= operator)
  • Test for subset of ( <= operator)
  • Sorting
  • BinarySearch

Use TSet<T> to benchmark your application.


unit GenericSet;

interface

Uses
  System.Generics.Defaults,
  System.Generics.Collections;

Type
  TSet<T> = record
    // Include (union)
    class operator Add(const aSet: TSet<T>; aValue: T) : TSet<T>; overload;
    class operator Add(const aSet: TSet<T>; const aTArr: TArray<T>) : TSet<T>; overload;
    class operator Add(const aSet1: TSet<T>; const aSet2: TSet<T>) : TSet<T>; overload;
    // Exclude
    class operator Subtract(const aSet: TSet<T>; aValue: T): TSet<T>; overload;
    class operator Subtract(const aSet: TSet<T>; const aTArr: TArray<T>) : TSet<T>; overload;
    class operator Subtract(const aSet1: TSet<T>; const aSet2: TSet<T>) : TSet<T>; overload;
    // left in right, i.e right.Contains(left)
    class operator In(aValue: T; const aSet: TSet<T>): Boolean; overload;
    class operator In(const aTArr: TArray<T>; const aSet: TSet<T>): Boolean; overload;
    class operator In(const aSet1: TSet<T>; const aSet2: TSet<T>): Boolean; overload;
    // Intersect, mutually common, A and B
    class operator Multiply(const aSet: TSet<T>; aValue: T): TSet<T>; overload;
    class operator Multiply(const aSet: TSet<T>; const aTArr: TArray<T>): TSet<T>; overload;
    class operator Multiply(const aSet1,aSet2 : TSet<T>): TSet<T>; overload;
    // Symmetric difference, A xor B = (A+B) - A.Intersect(B)
    class operator LogicalXor(const aSet: TSet<T>; aValue: T): TSet<T>; overload;
    class operator LogicalXor(const aSet: TSet<T>; aTArr: TArray<T>): TSet<T>; overload;
    class operator LogicalXor(const aSet1,aSet2 : TSet<T>): TSet<T>; overload;
    //
    class operator Equal(const aSet: TSet<T>; aValue: T): Boolean; overload;
    class operator Equal(const aSet: TSet<T>; aTArr: TArray<T>): Boolean; overload;
    class operator Equal(const aSetLeft,aSetRight: TSet<T>): Boolean; overload;
    // SubsetOf (Left <= Right)
    class operator LessThanOrEqual(const aSet: TSet<T>; aValue: T): Boolean; overload;
    class operator LessThanOrEqual(const aSet: TSet<T>; aTArr: TArray<T>): Boolean; overload;
    class operator LessThanOrEqual(const aSetLeft,aSetRight: TSet<T>): Boolean; overload;
    // SupersetOf (Left >= Right)
    class operator GreaterThanOrEqual(const aSet: TSet<T>; aValue: T): Boolean; overload;
    class operator GreaterThanOrEqual(const aSet: TSet<T>; aTArr: TArray<T>): Boolean; overload;
    class operator GreaterThanOrEqual(const aSetLeft,aSetRight: TSet<T>): Boolean; overload;
    // Set creator
    class function Create(const aTArr: array of T; checkDuplicates: Boolean = False): TSet<T>; static;
  private
    FSetArray : array of T;
    FSorted : String; // !! Will be predefined as '' (=False) by compiler.
    function GetEmpty: Boolean; inline;
    function GetItem(index: Integer): T; inline;
    function GetItemCount: Integer; inline;
    function GetSorted: Boolean; inline;
    procedure SetSorted( sorted: Boolean); inline;
  public
    // Add
    procedure Include(aValue: T); overload;
    procedure Include(const aTArr: TArray<T>); overload;
    procedure Include(const aTArr: array of T); overload;
    procedure Include(const aSet: TSet<T>); overload;
    // Subtract; A=[1,2,3]; B=[2,3,4]; B.Exclude(A) = B-A = [4]
    procedure Exclude(aValue: T); overload;
    procedure Exclude(const aTArr: TArray<T>); overload;
    procedure Exclude(const aTArr: array of T); overload;
    procedure Exclude(const aSet: TSet<T>); overload;
    // Multiply (A and B) A=[1,2,3]; B=[2,3,4]; B.Intersect(A) = B*A = [2,3]
    function Intersect(aValue: T): TSet<T>; overload;
    function Intersect(const aTArr: TArray<T>): TSet<T>; overload;
    function Intersect(const aTArr: array of T): TSet<T>; overload;
    function Intersect(const aSet: TSet<T>): TSet<T>; overload;

    // A xor B; A=[1,2,3]; B=[2,3,4]; (A+B)-A.Intersect(B) = [1,4]
    function SymmetricDiff(aValue: T): TSet<T>; overload;
    function SymmetricDiff(const aTArr: TArray<T>): TSet<T>; overload;
    function SymmetricDiff(const aTArr: array of T): TSet<T>; overload;
    function SymmetricDiff(const aSet: TSet<T>): TSet<T>; overload;
    // Identical set
    function Equal(aValue: T): Boolean; overload;
    function Equal(const aTArr: array of T; checkDuplicates: Boolean = False): Boolean; overload;
    function Equal(const aTArr: TArray<T>; checkDuplicates: Boolean = False): Boolean; overload;
    function Equal(const aSet: TSet<T>): Boolean; overload;
    //  Self <= aSet
    function SubsetOf(aValue: T): Boolean; overload;
    function SubsetOf(const aTArr: array of T; checkDuplicates: Boolean = False): Boolean; overload;
    function SubsetOf(const aTArr: TArray<T>; checkDuplicates: Boolean = False): Boolean; overload;
    function SubsetOf(const aSet: TSet<T>): Boolean; overload;
    // Self >= aSet
    function SupersetOf(aValue: T): Boolean; overload;
    function SupersetOf(const aTArr: array of T; checkDuplicates: Boolean = False): Boolean; overload;
    function SupersetOf(const aTArr: TArray<T>; checkDuplicates: Boolean = False): Boolean; overload;
    function SupersetOf(const aSet: TSet<T>): Boolean; overload;
    // Is included
    function Contains(aValue: T): Boolean; overload;
    function Contains(const aTArr: array of T): Boolean; overload;
    function Contains(const aTArr: TArray<T>): Boolean; overload;
    function Contains(const aSet: TSet<T>): Boolean; overload;

    procedure Sort; // QuickSort
    function Search( aValue: T): Boolean; // BinarySearch (Set must be sorted)
    procedure Clear;
    property IsSorted: Boolean read GetSorted;
    property IsEmpty: Boolean read GetEmpty;
    property Items[index: Integer]: T read GetItem; default;
    property ItemCount: Integer read GetItemCount;
  end;

implementation

class function TSet<T>.Create(const aTArr: array of T; checkDuplicates: Boolean = False): TSet<T>;
var
  i,j,elements : Integer;
  duplicate : Boolean;
  c : IEqualityComparer<T>;
begin
  if checkDuplicates then
  begin
    c := TEqualityComparer<T>.Default;
    // This will remove duplicates
    SetLength(Result.FSetArray,Length(aTArr));
    elements := 0;
    for i := 0 to High(aTArr) do
    begin
      duplicate := False;
      for j := 0 to Pred(elements) do
      begin
        duplicate := c.Equals(Result.FSetArray[j],aTArr[i]);
        if duplicate then
          Break;
      end;
      if not duplicate then
      begin
        Result.FSetArray[elements] := aTArr[i];
        Inc(elements);
      end;
    end;
    SetLength(Result.FSetArray,elements);
  end
  else
  begin
    SetLength(Result.FSetArray, Length(aTArr));
    for i := 0 to High(aTArr) do
      Result.FSetArray[i] := aTArr[i];
  end;
end;

class operator TSet<T>.Add(const aSet: TSet<T>; aValue: T): TSet<T>;
begin
  Result := aSet;
  Result.Include(aValue);
end;

class operator TSet<T>.Add(const aSet: TSet<T>; const aTArr: TArray<T>): TSet<T>;
begin
  Result := aSet;
  Result.Include(aTArr);
end;

class operator TSet<T>.Add(const aSet1, aSet2: TSet<T>): TSet<T>;
begin
  Result := aSet1;
  Result.Include(aSet2);
end;

procedure TSet<T>.Include(aValue: T);
begin
  if not Contains(aValue) then begin
    SetLength(FSetArray,Length(FSetArray)+1);
    FSetArray[High(FSetArray)] := aValue;
    SetSorted(False);
  end;
end;

procedure TSet<T>.Include(const aSet: TSet<T>);
begin
  if Self.IsEmpty then
    Self := aSet
  else
    Include(aSet.FSetArray);
end;

procedure TSet<T>.Include(const aTArr: TArray<T>);
var
  i : Integer;
begin
  for i := 0 to High(aTArr) do
    Self.Include(aTArr[i]);
end;

procedure TSet<T>.Include(const aTArr: array of T);
var
  i : Integer;
begin
  for i := 0 to High(aTArr) do
    Self.Include(aTArr[i]);
end;

procedure TSet<T>.Exclude(const aTArr: TArray<T>);
var
  i : Integer;
begin
  for i := 0 to High(aTArr) do
    Exclude(aTArr[i]);
end;

procedure TSet<T>.Exclude(const aTArr: array of T);
var
  i : Integer;
begin
  for i := 0 to High(aTArr) do
    Exclude(aTArr[i]);
end;

procedure TSet<T>.Exclude(const aSet: TSet<T>);
begin
  Exclude(aSet.FSetArray);
end;

procedure TSet<T>.Exclude(aValue: T);
var
  i : Integer;
  c : IEqualityComparer<T>;
begin
  c := TEqualityComparer<T>.Default;
  for i := 0 to High(FSetArray) do
  begin
    if c.Equals(FSetArray[i],aValue) then
    begin
      SetLength(FSetArray,Length(FSetArray)); // Ensure unique dyn array
      if (i < High(FSetArray)) then
      begin
        FSetArray[i] := FSetArray[High(FSetArray)]; // Move last element
        Self.SetSorted(False);
      end;
      SetLength(FSetArray,Length(FSetArray)-1);
      Break;
    end;
  end;
end;

class operator TSet<T>.Subtract(const aSet1, aSet2: TSet<T>): TSet<T>;
begin
  Result := aSet1;
  Result.Exclude(aSet2.FSetArray);
end;

class operator TSet<T>.Subtract(const aSet: TSet<T>;
  const aTArr: TArray<T>): TSet<T>;
begin
  Result := aSet;
  Result.Exclude(aTArr);
end;

class operator TSet<T>.Subtract(const aSet: TSet<T>; aValue: T): TSet<T>;
begin
  Result := aSet;
  Result.Exclude(aValue);
end;

class operator TSet<T>.In(aValue: T; const aSet: TSet<T>): Boolean;
begin
  Result := aSet.Contains(aValue);
end;

class operator TSet<T>.In(const aTArr: TArray<T>; const aSet: TSet<T>): Boolean;
begin
  Result := aSet.Contains(aTArr);
end;

class operator TSet<T>.In(const aSet1: TSet<T>; const aSet2: TSet<T>): Boolean;
begin
  Result := aSet2.Contains(aSet1.FSetArray);
end;

function TSet<T>.Contains(aValue: T): Boolean;
var
  i : Integer;
  c : IEqualityComparer<T>;
begin
  if IsSorted then
  begin
    Result := Search(aValue);
  end
  else
  begin
    Result := false;
    c := TEqualityComparer<T>.Default;
    for i := 0 to High(FSetArray) do
      if c.Equals(FSetArray[i],aValue) then
        Exit(True);
  end;
end;

function TSet<T>.Contains(const aTArr: array of T): Boolean;
var
  i: Integer;
begin
  Result := High(aTArr) >= 0;
  for i := 0 to High(aTArr) do
  begin
    if IsSorted then
      Result := Search(aTArr[i])
    else
      Result := Contains(aTArr[i]);
    if not Result then
      Exit(false);
  end;
end;

function TSet<T>.Contains(const aTArr: TArray<T>): Boolean;
var
  i : Integer;
begin
  Result := High(aTArr) >= 0;
  for i := 0 to High(aTArr) do
  begin
    if IsSorted then
      Result := Search(aTArr[i])
    else
      Result := Contains(aTArr[i]);
    if not Result then
      Exit(false);
  end;
end;

function TSet<T>.Contains(const aSet: TSet<T>): Boolean;
begin
  Result := Contains(aSet.FSetArray);
end;

function TSet<T>.GetEmpty: Boolean;
begin
  Result := (Self.ItemCount = 0);
end;

function TSet<T>.GetItem(index: Integer): T;
begin
  Result := Self.FSetArray[index];
end;

function TSet<T>.GetItemCount: Integer;
begin
  Result := Length(Self.FSetArray);
end;

procedure TSet<T>.Clear;
begin
  SetLength(FSetArray,0);
  Self.SetSorted(False);
end;

// Get the mutually common elements, aka the intersect.
class operator TSet<T>.Multiply(const aSet: TSet<T>; aValue: T): TSet<T>;
begin
  Result:= aSet.Intersect(aValue);
end;

class operator TSet<T>.Multiply(const aSet: TSet<T>; const aTArr: TArray<T>): TSet<T>;
begin
  Result:= aSet.Intersect(aTArr);
end;

class operator TSet<T>.Multiply(const aSet1,aSet2: TSet<T>): TSet<T>;
begin
  Result := aSet1.Intersect(aSet2);
end;

function TSet<T>.Intersect(aValue : T): TSet<T>;
var
  i : Integer;
begin
  if Self.Contains(aValue) then
    Result.Include(aValue)
  else
    Result.Clear;
  Result.SetSorted(Result.ItemCount = 1);
end;

function TSet<T>.Intersect(const aSet: TSet<T>): TSet<T>;
var
  i,items : Integer;
begin
  SetLength(Result.FSetArray,aSet.ItemCount);
  items := 0;
  for i := 0 to High(aSet.FSetArray) do
  begin
    if Self.Contains(aSet.FSetArray[i]) then
    begin
      Result.FSetArray[items] := aSet.FSetArray[i];
      Inc(items);
    end;
  end;
  SetLength(Result.FSetArray,items);
  Result.SetSorted(Self.IsSorted and aSet.IsSorted);
end;

function TSet<T>.Intersect(const aTArr: array of T): TSet<T>;
var
  i : Integer;
begin
  for i := 0 to High(aTArr) do
  begin
    if Self.Contains(aTArr[i]) then
      Result.Include(aTArr[i]);
  end;
  Result.SetSorted(False);
end;

function TSet<T>.Intersect(const aTArr: TArray<T>): TSet<T>;
var
  i : Integer;
begin
  for i := 0 to High(aTArr) do
  begin
    if Self.Contains(aTArr[i]) then
      Result.Include(aTArr[i]);
  end;
  Result.SetSorted(False);
end;

//
function TSet<T>.Equal(aValue: T): Boolean;
begin
  Result := (Self.ItemCount = 1) and Self.Contains(aValue);
end;

function TSet<T>.Equal(const aTArr: array of T; checkDuplicates: Boolean = False): Boolean;
begin
  if checkDuplicates then
    Result :=
      (Self.ItemCount <= Length(aTArr)) and
      Self.Equal(TSet<T>.Create(aTArr,True)) // Remove possible duplicates
  else
    Result :=
      (Self.ItemCount = Length(aTArr)) and
      Self.Contains(aTArr);
end;

function TSet<T>.Equal(const aTArr: TArray<T>; checkDuplicates: Boolean = False): Boolean;
begin
  if checkDuplicates then
    Result :=
      (Self.ItemCount <= Length(aTArr)) and
      Self.Equal(TSet<T>.Create(aTArr,True)) // Remove possible duplicates
  else
    Result :=
      (Self.ItemCount = Length(aTArr)) and
      Self.Contains(aTArr);
end;

function TSet<T>.Equal(const aSet: TSet<T>): Boolean;
begin
  Result :=
    (Self.ItemCount = aSet.ItemCount) and
    Contains(aSet);
end;

class operator TSet<T>.Equal(const aSet: TSet<T>; aValue: T): Boolean;
begin
  Result := aSet.Equal(aValue);
end;

class operator TSet<T>.Equal(const aSet: TSet<T>; aTArr: TArray<T>): Boolean;
begin
  Result := aSet.Equal(aTArr,True);
end;

class operator TSet<T>.Equal(const aSetLeft,aSetRight: TSet<T>): Boolean;
begin
  Result := aSetLeft.Equal(aSetRight);
end;

//  Self <= aSet
function TSet<T>.SubsetOf(aValue: T): Boolean;
begin
 Result := (Self.ItemCount = 1) and Self.Equal(aValue);
end;

function TSet<T>.SubsetOf(const aTArr: array of T; checkDuplicates: Boolean = False): Boolean;
begin
  Result := Self.SubsetOf(TSet<T>.Create(aTArr,checkDuplicates));
end;

function TSet<T>.SubsetOf(const aTArr: TArray<T>; checkDuplicates: Boolean = False): Boolean;
begin
  Result := SubsetOf(TSet<T>.Create(aTArr,checkDuplicates));
end;

function TSet<T>.SubsetOf(const aSet: TSet<T>): Boolean;
begin
  Result :=
    (Self.ItemCount <= aSet.ItemCount) and
    aSet.Contains(Self);
end;

// SubsetOf (Left <= Right)
class operator TSet<T>.LessThanOrEqual(const aSet: TSet<T>; aValue: T): Boolean;
begin
  Result := aSet.SubsetOf(aValue);
end;

class operator TSet<T>.LessThanOrEqual(const aSet: TSet<T>; aTArr: TArray<T>): Boolean;
begin
  Result := aSet.SubsetOf(aTArr,True);
end;

class operator TSet<T>.LessThanOrEqual(const aSetLeft,aSetRight: TSet<T>): Boolean;
begin
  Result := aSetLeft.SubsetOf(aSetRight);
end;

// Self >= aSet
function TSet<T>.SupersetOf(const aSet: TSet<T>): Boolean;
begin
  Result :=
    (Self.ItemCount >= aSet.ItemCount) and
    Self.Contains(aSet);
end;

function TSet<T>.SupersetOf(aValue: T): Boolean;
begin
  Result := (Self.ItemCount >= 1) and Self.Contains(aValue);
end;

function TSet<T>.SupersetOf(const aTArr: array of T; checkDuplicates: Boolean = False): Boolean;
begin
  Result := SupersetOf(TSet<T>.Create(aTArr,checkDuplicates));
end;

function TSet<T>.SupersetOf(const aTArr: TArray<T>; checkDuplicates: Boolean = False): Boolean;
begin
  Result := SupersetOf(TSet<T>.Create(aTArr,checkDuplicates));
end;

// SupersetOf (Left >= Right)
class operator TSet<T>.GreaterThanOrEqual(const aSet: TSet<T>; aValue: T): Boolean;
begin
  Result := aSet.SupersetOf(aValue);
end;

class operator TSet<T>.GreaterThanOrEqual(const aSet: TSet<T>; aTArr: TArray<T>): Boolean;
begin
  Result := aSet.SupersetOf(aTArr,True);
end;

class operator TSet<T>.GreaterThanOrEqual(const aSetLeft,aSetRight: TSet<T>): Boolean;
begin
  Result := aSetLeft.SupersetOf(aSetRight);
end;

// A xor B; A=[1,2,3]; B=[2,3,4]; (A+B)-A.Intersect(B) = [1,4] alt:
function TSet<T>.SymmetricDiff(aValue: T): TSet<T>;
begin
  Result := Self;
  Result.Include(aValue);
  Result.Exclude(Self.Intersect(aValue));
  Result.SetSorted(False);
end;

function TSet<T>.SymmetricDiff(const aTArr: TArray<T>): TSet<T>;
begin
  Result := Self;
  Result.Include(aTArr);
  Result.Exclude(Self.Intersect(aTArr));
  Result.SetSorted(False);
end;

function TSet<T>.SymmetricDiff(const aTArr: array of T): TSet<T>;
begin
  Result := Self;
  Result.Include(aTArr);
  Result.Exclude(Self.Intersect(aTArr));
  Result.SetSorted(False);
end;

function TSet<T>.SymmetricDiff(const aSet: TSet<T>): TSet<T>;
begin
  Result:= Self;
  Result.Include(aSet);
  Result.Exclude(Self.Intersect(aSet));
  Result.SetSorted(False);
end;

class operator TSet<T>.LogicalXor(const aSet: TSet<T>; aValue: T): TSet<T>;
begin
  Result := aSet.SymmetricDiff(aValue);
end;

class operator TSet<T>.LogicalXor(const aSet: TSet<T>; aTArr: TArray<T>): TSet<T>;
begin
  Result := aSet.SymmetricDiff(aTArr);
end;

class operator TSet<T>.LogicalXor(const aSet1,aSet2 : TSet<T>): TSet<T>;
begin
  Result := aSet1.SymmetricDiff(aSet2);
end;

procedure TSet<T>.Sort;
begin
  SetLength(Self.FSetArray,Length(Self.FSetArray)); // Ensure COW
  TArray.Sort<T>(Self.FSetArray);
  SetSorted(True);
end;

function TSet<T>.Search(aValue: T): Boolean;
var
  Index: Integer;
begin
  Result := TArray.BinarySearch<T>(Self.FSetArray,aValue,Index);
end;

function TSet<T>.GetSorted: Boolean;
begin
  Result := (FSorted = '1');
end;

procedure TSet<T>.SetSorted(sorted: Boolean);
begin
  if sorted then
    FSorted := '1'
  else
    FSorted := '0';
end;

end.

A benchmark:

program ProjectGenericSet;

{$APPTYPE CONSOLE}

uses
  System.Diagnostics,
  System.Generics.Defaults,
  System.Generics.Collections,
  GenericSet in 'GenericSet.pas';

var
  set1,set2,set3 : TSet<Word>;
  sw : TStopWatch;
  ok : Boolean;
  i,j,max: Integer;
begin
  Randomize;
  max := $10000;
  // Populate a sample set with 32K items.
  repeat
    set1.Include(Random(max));
  until (set1.ItemCount = (max DIV 2));
  // Populate a test set with items in sample set
  repeat
    set2.Include(set1[Random(max DIV 2)]);
  until (set2.ItemCount = 100);
  WriteLn('Test in Sample (unsorted), 1.000 iterations...');
  sw := TStopWatch.StartNew;
  for i  := 1 TO 1000 DO
    ok := set1.Contains(set2);
  sw.Stop;
  WriteLn('Result:',ok,' ',sw.ElapsedMilliseconds,' [ms]');
  set1.Sort; // Sort
  WriteLn('Test in Sample (sorted), 200.000 iterations...');
  sw := TStopWatch.StartNew;
  for i  := 1 TO 200000 DO
  begin
    ok := set1.Contains(set2);
  end;
  sw.Stop;
  WriteLn('Result:',ok,' ',sw.ElapsedMilliseconds,' [ms]');
  WriteLn('Test*Test (unsorted), 200.000 iterations...');
  sw := TStopWatch.StartNew;
  for i  := 1 TO 200000 DO
  begin
    set3 := set2.Intersect(set2);
  end;
  sw.Stop;
  WriteLn('Result:',set3=set2,' ',sw.ElapsedMilliseconds,' [ms]');
  set2.Sort;
  WriteLn('Test*Test (sorted), 200.000 iterations...');
  sw := TStopWatch.StartNew;
  for i  := 1 TO 200000 DO
  begin
    set3 := set2.Intersect(set2);
  end;
  sw.Stop;
  WriteLn('Result:',set3=set2,' ',sw.ElapsedMilliseconds,' [ms]');
  ReadLn;
end.

Upvotes: 13

Johan
Johan

Reputation: 76597

You're dealing with a many-to-many relationship here.
If it were a database that means you'd put in 3 tables:

1. People
2. Colors
3. Link table between 1 and 2

I suggest you either fix the problem by utilizing a database or model the thing in Delphi just like it where a database.

Using Delphi structures
Furthermore stop using shortstring They are outdated and have zero benefits over longstrings.
Using 3 tables means you can quickly get a list of people per color and colors per person.

Here's how it would work:

TPerson = record
  name: string;
  other_data....
end;

TPeople = array of TPerson;

TFavColor = record
  name: string;
  other_data....
end;

TFavColors = array of TFavColor;

TPersonColor = record
  PersonIndex: Cardinal;  <<-- index into the TPeople array
  ColorIndex: Cardinal;   <<-- index into the TFavColors array
end;

TPersonColors = array of TPersonColor;

Now you can just loop over the TPersonColors array to extract your data.

Using a database
In SQL it would be even faster because your data is indexed (foreign key are (should be) always indexed).

The SQL statement the see all people that like blue and red would look like (using MySQL syntax here):

SELECT p.name  
FROM person p
INNER JOIN personcolor pc ON (pc.person_id = p.id)
INNER JOIN color c1 ON (pc.color_id = c1.id)
INNER JOIN color c2 ON (pc.color_id = c2.id)
WHERE c1.name = 'red' AND c2.name = 'blue'
GROUP BY p.id <<-- eliminate duplicates (not sure it's needed)

Using Delphi its trivial to link a database to your app.
So that's the route I'd recommend.

Upvotes: 1

Related Questions