Adem
Adem

Reputation: 253

System.Move and Array of String

I am trying to move some array elements (of string) to some other position.

When I use System.Move(), FastMM4 reports leaks.

Here is a small snippet to show the problem:

procedure TForm1.Button2Click(Sender: TObject);
type
  TArrayOfStr = array of string;
const
  Count1 = 100;
  String1 = 'some string ';  {space at end}
var
  Array1: TArrayOfStr;
  Index1: Integer;
begin
  SetLength(Array1, Count1);
  Index1 := 0;
  while Index1 < Count1 do begin
    Array1[Index1] := String1 + IntToStr(Index1);
    Inc(Index1);
  end;
  System.Move(Array1[0], Array1[3], 2 * SizeOf(string)); {move 2 cells from cell 0 to cell 3}
  ShowMessage(Array1[3]);
end;

It probably has something to do with SizeOf(String) but I don't know what.

Could someone help me make the leaks go away?

Upvotes: 1

Views: 2354

Answers (5)

procedure  TamListVar<T>.Insert(aIndex: Integer; aItem: T);
begin
    InitCheck;
    if not  IsIndex(aIndex) then  Add(aItem)
    else
    begin
        Setlength(Items,FCount+1);
        System.Move(Items[aIndex], Items[aIndex + 1],  (FCount - aIndex) * SizeOf(Items[aIndex]));
        PPointer(@Items[aIndex])^:=nil;
        Items[aIndex]:= aItem;
        inc(FCount);
    end;
end;

Upvotes: 0

Adem
Adem

Reputation: 253

Here is the solution I have come up with. It is heavily inspired by System.Move().

As far as I could see, after doing a number of tests, it seems to work OK --no leaks reported by FastMM4.

Obvioulsy, this is not the hand-optimized asm routine I was after; but given my (lack of) talents in asm area, this has to do for now.

I'd be most appreciative if you commented on this --especially to point out any pitfalls, as well as any other (e.g. speed) improvements.

{ACount refers to the number of actual array elements (cells of strings), not their byte count.}
procedure MoveString(const ASource; var ATarget; ACount: NativeInt);
type
  PString = ^string;
const
  SzString = SizeOf(string);
var
  Source1: PString;
  Target1: PString;
begin
  Source1 := PString(@ASource);
  Target1 := PString(@ATarget);
  if Source1 = Target1 then Exit;
  while ACount > 0 do begin
    Target1^ := Source1^;
    //Source1^ := ''; {enable if you want to avoid duplicates}
    Inc(Source1);
    Inc(Target1);
    Dec(ACount);
  end;
end;

procedure TForm1.Button2Click(Sender: TObject);
type
  TArrayOfStr = array of string;
const
  Count1 = 100;
  String1 = 'some string ';  {space at end}
var
  Array1: TArrayOfStr;
  Index1: Integer;
begin
  SetLength(Array1, Count1);
  Index1 := 0;
  while Index1 < Count1 do begin
    Array1[Index1] := String1 + IntToStr(Index1);
    Inc(Index1);
  end;
  MoveString(Array1[0], Array1[3], 2); {move 2 cells from cell 0 to cell 3}
  ShowMessage(Array1[3]); {should be 'some string 0'}
  MoveString(Array1[3], Array1[0], 2); {move 2 cells from cell 3 to cell 0}
  ShowMessage(Array1[0]); {should be 'some string 0'}
end;

Upvotes: -2

Disillusioned
Disillusioned

Reputation: 14842

I'm adding a fundamentally different answer in which I present an option how you can do something you really and truly should not be doing at all.

Disclaimer: What I describe here is terrible advice and should not be done (but it would work). Feel free to learn from it, but don't use it.

As has already been discussed ad nauseam, your problem is that when you use Move you bypass reference counting.

  • The solution involves making "internal" reference counting unnecessary by working on the principal that no matter how many times a string is internally referenced, you'll hold only a single count to the string.
  • And you'll only remove that single count when you're certain you have no more "internal" references to the string.
  • Unfortunately, having abandoned internal reference counting, the only time you can be sure of this is when you've completely finished with your internal array.
  • At this point you must first manually clear the internal array.
  • Then reduce the reference count of all strings you had previously used by 1.
  • This implies you need a separate "master reference" to each string.

Warning
There are 2 immediate problems:

  • You need a second list to track the master reference count; wasting memory.
  • You cannot recover memory used by strings that are no longer referenced internally until you've finished with the internal array entirely because you've abandoned your ability to track internal reference counts.
    (Technically this is still a memory leak, albeit controlled. And FastMM won't report it provided you cleanup correctly.)

Without further ado, some sample code:

//NOTE: Deliberate use of fixed size array instead of dynamic to
//avoid further complications. See Yet Another Warning after code
//for explanation and resolution.
TStringArray100 = array[0..99] of string;
TBadStrings = class(TObject)
private
  FMasterStrings: TStrings;
  FInternalStrings: TStringArray100;
public
  ...
end;

constructor TBadStrings.Create()
begin
  FMasterStings := TStringList.Create;
end;

destructor TBadStrings.Destroy;
begin
  Clear;
  FMasterStrings.Free;
  inherited;
end;

procedure TBadStrings.Clear;
begin
  for I := 0 to 99 do
    Pointer(FInternalStrings[I]) := 0;

  //Should optimise to equivalent of
  //Move(0, FInternalStrings[0], 100 * SizeOf(String));

  //NOTE: Only now is it safe to clear the master list.
  FMasterStings.Clear;
end;

procedure TBadStrings.SetString(APos: Integer; AString: string);
begin
  FMasterStrings.Add(AString); //Hold single reference count

  //Bypass reference counting to assign the string internally
  //Equivalent to Move
  Pointer(FInternalStrings[APos]) := Pointer(AString);
end;

//Usage
begin
  LBadStrings := TBadStrings.Create;
  try
    for I := 0 to 199 do
    begin
      //NOTE 0 to 99 are set, then all overwritten with 100 to 199
      //However strings 0 to 99 are still in memory...
      LBadStrings.SetString(I mod 100, 'String ' + IntToStr(I));
    end;
  finally
    //...until cleanup.
    LBadStrings.Free;
  end;
end;

NOTE: You can add methods to do whatever you like using Move on FInternalStrings. It won't matter that those references aren't being tracked because the master reference can perform correct cleanup at the end. However....

WARNING: Anything you do to FInternalStrings MUST also bypass reference counting otherwise you'll have nasty side-effects. So it should go without saying you need to solidly guard access to the internal array. If client code gets direct access to the array, you can expect accidental 'abuse'.

Yet Another Warning: As commented in code this uses a fixed size array to avoid other problems. Said problems are that if you use a dynamic array, then resizing the array can apply reference counting. Increasing the array size shouldn't be a problem (I recall that being a pointer copy). However, when the size is decreased, elements that are discarded will be dereferenced as necessary. This means you'll have to take the precaution of first pointer nilling these elements before shrinking the array.

The above is a way you can bypass string reference counting in a controlled fashion. But let me reiterate what a terrible idea it is.

You should have no trouble concocting an artificial benchmark to demonstrate that it's faster. However, I seriously doubt it will provide any benefit in a real-world environment.
In fact, if it does you probably have a different problem entirely; because why on earth would be shuffling the strings around so much that time spent there overshadows other aspects of your application?

Upvotes: 0

Disillusioned
Disillusioned

Reputation: 14842

The String type in Delphi is a managed type. Delphi keeps record of references and dereferences and automatically releases memory allocated to the string when it's no longer being referenced.

The reason for the leak is that you are bypassing Delphi's management of the string type. You are simply overwriting the pointer that references the 4th string in the array. (and for that matter the 5th as well because of 2 * ...) So you now have strings in memory that are no longer referenced. But Delphi doesn't realise this and cannot free the memory.

Solution: write Array1[3] := Array1[0];


Edit:

I've fully answered your question; giving you everything you need to: 1) Understand why you've got the memory leaks. 2) And make the leaks go away.

Yet you're not satisfied... In comments you've explained that you're trying to improve a phantom performance problem you've conjured up via an artificial benchmark.

  • It's been explained to you that the only reason Move is a little faster than normal string assignment is that: string assignment needs to do additional record keeping to prevent memory leaks and access violations.
  • If you insist on using Move, you'll need to do said record keeping yourself. (Johan has even demonstrated how.)
  • And then you complain that this will slow down your Move "solution".
  • Seriously, take your pick: If you want to use string, you can have it a little faster with AV's and memory leaks OR a little slower but behaving correctly. There's no magic wand waving that's going fix it for you.
  • You could choose to abandon the string type (and all the goodness it gives you). I.e. Use array of char instead. Of course, you'll have to do all the memory management yourself, and see how far multi-byte string copying helps you.

I still maintain that if you ask a new question demonstrating a specific performance problem you're trying to solve, you'll get much better feedback.

E.g. In comments you've mentioned that you're trying to improve the performance of TStringList.

I have previously encountered a performance problem with TStringList in older versions of Delphi. It's generally fine even working with hundreds of thousands of items. However, even with only 10,000 strings, CommaText was noticeably slow; and at 40,000 it was almost unbearable.

The solution didn't involve trying to bastardise reference counting: because that's not the reason it was slow. It was slow because the algorithm was a little naive and performed a huge number of incremental memory allocations. Writing a custom CommaText method solved it.

Upvotes: 3

Johan
Johan

Reputation: 76743

Issues
The problem you are having has to do with the reference counting of the string.

  1. Leaks
    If there is already a string in the areas you're overwriting these strings will not get freed. These are the leaks you are reporting.

  2. Potential access violations
    You copy the string pointers, but you do not increase the reference count of the string. This will lead to access violations if the original strings ever get destroyed due to going out of scope. This is a very subtle bug and will bite you when you least expect it.

Best solution
It's far simpler to just let Delphi do the copying and then all internal bookkeeping will get done properly.

  {move 6 from cell 1 to cell 3}
  System.Move(Array1[0], Array1[3], 2 * SizeOf(string)); 
  //This does not increase the reference count for the string;
  //leading to problems at cleanup.

  Array1[3]:= Array1[0];
  Array1[4]:= Array1[1];  //in a loop obviously :-)
  //this increases the reference count of the string.

Note that Delphi does not copy the strings, it just copies the pointers and increases the ref counts as needed. It also frees any strings as needed.

Hack solution
You should manually clear the area first.
Using

for i:= start to finish do Array1[i]:= '';

The next part of this solution horrible hack is to manually increase the ref counts on the strings you've copied.
See: http://docwiki.embarcadero.com/RADStudio/Seattle/en/Internal_Data_Formats#Long_String_Types

procedure IncreaseRefCount(const str: string; HowMuch: integer = 1);
var
  Hack: PInteger;
begin
  Hack:= pointer(str);
  Dec(Hack,2); //get ref count
  Hack^:= AtomicIncrement(Hack^,HowMuch);
end;

  System.Move(Array1[0], Array1[3], 2 * SizeOf(string)); 
  IncreaseRefCount(Array1[3]);
  .... do this for every copied item.

Note that this hack is not completely thread safe if you get the strings from somewhere outside your array.

However if you are really really in need of speed it might be a solution to gain a wooping 2% in the performance of the copy.

Warning
Don't use this code to decrease ref counts manually, you'll run into thread-safety issues!

Need for speed
It is unlikely that simply coping a few strings leads to slowness.
There is no way to get out of the clean up issues if you insist on using managed strings.
On the other hand the overhead of the reference counting is really not that bad, so I suspect the reason for the slowness lies elsewhere; somewhere we can't see because you haven't told us your problem.

I suggest you ask a new question explaining what you're trying to do, why and where the slowness is hurting you.

Upvotes: 4

Related Questions