Paul
Paul

Reputation: 3283

What is the fastest XML Parser available for Delphi?

We have reasonably large XML strings which we currently parse using MSXML2

I have just tried using MSXML6 hoping for a speed improvement and have got nothing!

We currently create a lot of DOM Documents and I guess there may be some overhead in constantly interacting with the MSXML2/6 dll

Does anyone know of a better/faster XML component for Delphi?

If anyone can suggest an alternative, and it is faster, we would look to integrate it, but that would be a lot of work, so hopefully the structure would not be too different to that used by MSXML

We are using Delphi 2010

Paul

Upvotes: 32

Views: 23031

Answers (5)

oxo
oxo

Reputation: 996

I know that it's an old question, but people might find it interesting:

I wrote a new XML library for Delphi (OXml): http://www.kluug.net/oxml.php

It features direct XML handling (read+write), SAX parser, DOM and a sequential DOM parser. One of the benefits is that OXml supports Delphi 6-Delphi XE5, FPC/Lazarus and C++Builder on all platforms (Win, MacOSX, Linux, iOS, Android).

OXml DOM is record/pointer based and offers better performance than any other XML library:

The read test returns the time the parser needs to read a custom XML DOM from a file (column "load") and to write node values to a constant dummy function (column "navigate"). The file is encoded in UTF-8 and it's size is about 5,6 MB.

XML parse comparison

The write test returns the time the parser needs to create a DOM (column "create") and write this DOM to a file (column "save"). The file is encoded in UTF-8 and it's size is about 11 MB.

XML write comparison

+ The poor OmniXML (original) writing performance was the result of the fact that OmniXML didn't use buffering for writing. Thus writing to a TFileStream was very slow. I updated OmniXML and added buffering support. You can get the latest OmniXML code from the SVN.

Upvotes: 23

teran
teran

Reputation: 3234

some time ago I had to serialize record to XML format; for ex:

 TTest = record
    a : integer;
    b : real; 
 end;

to

    <Data>
        <a type="tkInteger">value</a>
        <b type="tkFloat">value</b>
    </Data>

I used RTTI to recursively navigate through record fields and storing values to XML. I've tried few XML Parsers. I did't need DOM model to create xml, but needed it to load it back.

XML contained about 310k nodes (10-15MBytes); results presented in table below, there are 6 columns with time in seconds;
1 - time for creating nodes and write values
2 - SaveToFile();
3 = 1 + 2
4 - LoadFromFile();
5 - navigate through nodes and read values
6 = 4 + 5
enter image description here

MSXML/Xerces/ADOM - are differend vendors for TXMLDocument (DOMVendor)
JanXML doesn't work with unicode; I fixed some errors, and saved XML, but loading causes AV (or stack overflow, I don't remember);
manual - means manually writing XML using TStringStream.

I used Delphi2010, Win7x32, Q8200 CPU/2.3GHz, 4Gb of RAM.

update: You can download source code for this test (record serialization to XML using RTTI) here http://blog.karelia.pro/teran/files/2012/03/XMLTest.zip All parsers (Omni, Native, Jan) are included (now nodes count in XML is about 270k), sorry there are no comments in code.

Upvotes: 33

Lars Frische
Lars Frische

Reputation: 329

Recently I had a similar issue where using the MSXML DOM parser proved to be too slow for the given task. I had to parse rather large documents > 1MB and the memory consumption of the DOM parser was prohibitive. My solution was to not use a DOM parser at all, but to go with the event driven MSXML SAX parser. This proved to be much, much faster. Unfortunately the programming model is totally different, but dependent on the task, it might be worth it. Craig Murphy has published an excellent article on how to use the MSXML SAX parser in delphi: SAX, Delphi and Ex Em El

Upvotes: 12

g2mk
g2mk

Reputation: 507

Someday I have written very simple XML test suite. It serves MSXML (D7 MSXML3?), Omni XML (bit old) and Jedi XML (latest stable).

Test results for 1,52 MB file:

XML file loading time MSXML: 240,20 [ms]

XML node selections MSXML: 1,09 [s]

XML file loading time OmniXML: 2,25 [s]

XML node selections OmniXML: 1,22 [s]

XML file loading time JclSimpleXML: 2,11 [s]

and access violation for JclSimpleXML node selections :|

Unfortunately I actually haven't much time to correct above AV, but sorces are contained below...

fmuMain.pas

program XmlEngines;

uses
  FastMM4,
  Forms,
  fmuMain in 'fmuMain.pas' {fmMain},
  uXmlEngines in 'uXmlEngines.pas',
  ifcXmlEngine in 'ifcXmlEngine.pas';

{$R *.res}

begin
  Application.Initialize;
  Application.Title := 'XML Engine Tester';
  Application.CreateForm(TfmMain, fmMain);
  Application.Run;
end.

fmuMain.pas

unit fmuMain;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, xmldom, XMLIntf, msxmldom, XMLDoc,
  //
  ifcXmlEngine, StdCtrls;

type
  TfmMain = class(TForm)
    mmoDebug: TMemo;
    dlgOpen: TOpenDialog;

    procedure FormCreate(Sender: TObject);
    procedure FormDestroy(Sender: TObject);

    procedure mmoDebugClick(Sender: TObject);

  private
    fXmlEngines: TInterfaceList;
    function Get_Engine(const aIx: Integer): IXmlEngine;

  protected
    property XmlEngine[const aIx: Integer]: IXmlEngine read Get_Engine;

    procedure Debug(const aInfo: string); // inline

  public
    procedure RegisterXmlEngine(const aEngine: IXmlEngine);

  end;

var
  fmMain: TfmMain;

implementation

{$R *.dfm}

uses
  uXmlEngines, TZTools;

{ TForm1 }

function TfmMain.Get_Engine(const aIx: Integer): IXmlEngine;
begin
  Result:= nil;
  Supports(fXmlEngines[aIx], IXmlEngine, Result)
end;

procedure TfmMain.RegisterXmlEngine(const aEngine: IXmlEngine);
var
  Ix: Integer;
begin
  if aEngine = nil then
    Exit; // WARRNING: program flow disorder

  for Ix:= 0 to Pred(fXmlEngines.Count) do
    if XmlEngine[Ix] = aEngine then
      Exit; // WARRNING: program flow disorder

  fXmlEngines.Add(aEngine)
end;

procedure TfmMain.FormCreate(Sender: TObject);
begin
  fXmlEngines:= TInterfaceList.Create();
  dlgOpen.InitialDir:= ExtractFileDir(ParamStr(0));
  RegisterXmlEngine(TMsxmlEngine.Create(Self));
  RegisterXmlEngine(TOmniXmlEngine.Create());
  RegisterXmlEngine(TJediXmlEngine.Create());
end;

procedure TfmMain.mmoDebugClick(Sender: TObject);

  procedure TestEngines(const aFilename: TFileName);

    procedure TestEngine(const aEngine: IXmlEngine);
    var
      PerfCheck: TPerfCheck;
      Ix: Integer;
    begin
      PerfCheck := TPerfCheck.Create();
      try

        PerfCheck.Init(True);
        PerfCheck.Start();
        aEngine.Load(aFilename);
        PerfCheck.Pause();
        Debug(Format(
          'XML file loading time %s: %s',
          [aEngine.Get_ID(), PerfCheck.TimeStr()]));

        if aEngine.Get_ValidNode() then
        begin
          PerfCheck.Start();
          for Ix:= 0 to 999999 do
            if aEngine.Get_ChildsCount() > 0 then
            begin

              aEngine.SelectChild(Ix mod aEngine.Get_ChildsCount());

            end
            else
              aEngine.SelectRootNode();

          PerfCheck.Pause();
          Debug(Format(
            'XML nodes selections %s: %s',
            [aEngine.Get_ID(), PerfCheck.TimeStr()]));
        end

      finally
        PerfCheck.Free();
      end
    end;

  var
    Ix: Integer;
  begin
    Debug(aFilename);
    for Ix:= 0 to Pred(fXmlEngines.Count) do
      TestEngine(XmlEngine[Ix])
  end;

var
  CursorBckp: TCursor;
begin
  if dlgOpen.Execute() then
  begin

    CursorBckp:= Cursor;
    Self.Cursor:= crHourGlass;
    mmoDebug.Cursor:= crHourGlass;
    try
      TestEngines(dlgOpen.FileName)
    finally
      Self.Cursor:= CursorBckp;
      mmoDebug.Cursor:= CursorBckp;
    end

  end
end;

procedure TfmMain.Debug(const aInfo: string);
begin
  mmoDebug.Lines.Add(aInfo)
end;

procedure TfmMain.FormDestroy(Sender: TObject);
begin
  fXmlEngines.Free()
end;

end.

ifcXmlEngine.pas

unit ifcXmlEngine;

interface

uses
  SysUtils;

type
  TFileName = SysUtils.TFileName;

  IXmlEngine = interface
    ['{AF77333B-9873-4FDE-A3B1-260C7A4D3357}']
    procedure Load(const aFilename: TFileName);
    procedure SelectRootNode();
    procedure SelectChild(const aIndex: Integer);
    procedure SelectParent();
    //
    function Get_ID(): string;
    function Get_ValidNode(): Boolean;
    function Get_ChildsCount(): Integer;
    function Get_HaveParent(): Boolean;
    //function Get_NodeName(): Boolean;
  end;

implementation

end.

uXmlEngines.pas

unit uXmlEngines;

interface

uses
  Classes,
  //
  XMLDoc, XMLIntf, OmniXml, JclSimpleXml,
  //
  ifcXmlEngine;

type
  TMsxmlEngine = class(TInterfacedObject, IXmlEngine)
  private
    fXmlDoc: XMLDoc.TXMLDocument;
    fNode: XMLIntf.IXMLNode;

  protected

  public
    constructor Create(const aOwner: TComponent);
    destructor Destroy; override;

    procedure Load(const aFilename: TFileName);
    procedure SelectRootNode();
    procedure SelectChild(const aIndex: Integer);
    procedure SelectParent();
    //
    function Get_ID(): string;
    function Get_ValidNode(): Boolean;
    function Get_ChildsCount(): Integer;
    function Get_HaveParent(): Boolean;
    //function Get_NodeName(): Boolean;

  end;

  TOmniXmlEngine = class(TInterfacedObject, IXmlEngine)
  private
    fXmlDoc: OmniXml.IXmlDocument;
    fNode: OmniXml.IXMLNode;

  protected

  public
    constructor Create;
    destructor Destroy; override;

    procedure Load(const aFilename: TFileName);
    procedure SelectRootNode();
    procedure SelectChild(const aIndex: Integer);
    procedure SelectParent();
    //
    function Get_ID(): string;
    function Get_ValidNode(): Boolean;
    function Get_ChildsCount(): Integer;
    function Get_HaveParent(): Boolean;
    //function Get_NodeName(): Boolean;

  end;

  TJediXmlEngine = class(TInterfacedObject, IXmlEngine)
  private
    fXmlDoc: TJclSimpleXML;
    fNode: TJclSimpleXMLElem;

  protected

  public
    constructor Create();
    destructor Destroy(); override;

    procedure Load(const aFilename: TFileName);
    procedure SelectRootNode();
    procedure SelectChild(const aIndex: Integer);
    procedure SelectParent();
    //
    function Get_ID(): string;
    function Get_ValidNode(): Boolean;
    function Get_ChildsCount(): Integer;
    function Get_HaveParent(): Boolean;
    //function Get_NodeName(): Boolean;

  end;

implementation

uses
  SysUtils;

{ TMsxmlEngine }

constructor TMsxmlEngine.Create(const aOwner: TComponent);
begin
  if aOwner = nil then
    raise Exception.Create('TMsxmlEngine.Create() -> invalid owner');

  inherited Create();
  fXmlDoc:= XmlDoc.TXmlDocument.Create(aOwner);
  fXmlDoc.ParseOptions:= [poPreserveWhiteSpace]
end;

destructor TMsxmlEngine.Destroy;
begin
  fXmlDoc.Free();
  inherited Destroy()
end;

function TMsxmlEngine.Get_ChildsCount: Integer;
begin
  Result:= fNode.ChildNodes.Count
end;

function TMsxmlEngine.Get_HaveParent: Boolean;
begin
  Result:= fNode.ParentNode <> nil
end;

function TMsxmlEngine.Get_ID: string;
begin
  Result:= 'MSXML'
end;

//function TMsxmlEngine.Get_NodeName: Boolean;
//begin
//  Result:= fNode.Text
//end;

function TMsxmlEngine.Get_ValidNode: Boolean;
begin
  Result:= fNode <> nil
end;

procedure TMsxmlEngine.Load(const aFilename: TFileName);
begin
  fXmlDoc.LoadFromFile(aFilename);
  SelectRootNode()
end;

procedure TMsxmlEngine.SelectChild(const aIndex: Integer);
begin
  fNode:= fNode.ChildNodes.Get(aIndex)
end;

procedure TMsxmlEngine.SelectParent;
begin
  fNode:= fNode.ParentNode
end;

procedure TMsxmlEngine.SelectRootNode;
begin
  fNode:= fXmlDoc.DocumentElement
end;

{ TOmniXmlEngine }

constructor TOmniXmlEngine.Create;
begin
  inherited Create();
  fXmlDoc:= OmniXml.TXMLDocument.Create();
  fXmlDoc.PreserveWhiteSpace:= true
end;

destructor TOmniXmlEngine.Destroy;
begin
  fXmlDoc:= nil;
  inherited Destroy()
end;

function TOmniXmlEngine.Get_ChildsCount: Integer;
begin
  Result:= fNode.ChildNodes.Length
end;

function TOmniXmlEngine.Get_HaveParent: Boolean;
begin
  Result:= fNode.ParentNode <> nil
end;

function TOmniXmlEngine.Get_ID: string;
begin
  Result:= 'OmniXML'
end;

//function TOmniXmlEngine.Get_NodeName: Boolean;
//begin
//  Result:= fNode.NodeName
//end;

function TOmniXmlEngine.Get_ValidNode: Boolean;
begin
  Result:= fNode <> nil
end;

procedure TOmniXmlEngine.Load(const aFilename: TFileName);
begin
  fXmlDoc.Load(aFilename);
  SelectRootNode()
end;

procedure TOmniXmlEngine.SelectChild(const aIndex: Integer);
begin
  fNode:= fNode.ChildNodes.Item[aIndex]
end;

procedure TOmniXmlEngine.SelectParent;
begin
  fNode:= fNode.ParentNode
end;

procedure TOmniXmlEngine.SelectRootNode;
begin
  fNode:= fXmlDoc.DocumentElement
end;

{ TJediXmlEngine }

constructor TJediXmlEngine.Create;
begin
  inherited Create();
  fXmlDoc:= TJclSimpleXML.Create();
end;

destructor TJediXmlEngine.Destroy;
begin
  fXmlDoc.Free();
  inherited Destroy()
end;

function TJediXmlEngine.Get_ChildsCount: Integer;
begin
  Result:= fNode.ChildsCount
end;

function TJediXmlEngine.Get_HaveParent: Boolean;
begin
  Result:= fNode.Parent <> nil
end;

function TJediXmlEngine.Get_ID: string;
begin
  Result:= 'JclSimpleXML';
end;

//function TJediXmlEngine.Get_NodeName: Boolean;
//begin
//  Result:= fNode.Name
//end;

function TJediXmlEngine.Get_ValidNode: Boolean;
begin
  Result:= fNode <> nil
end;

procedure TJediXmlEngine.Load(const aFilename: TFileName);
begin
  fXmlDoc.LoadFromFile(aFilename);
  SelectRootNode()
end;

procedure TJediXmlEngine.SelectChild(const aIndex: Integer);
begin
  fNode:= fNode.Items[aIndex]
end;

procedure TJediXmlEngine.SelectParent;
begin
  fNode:= fNode.Parent
end;

procedure TJediXmlEngine.SelectRootNode;
begin
  fNode:= fXmlDoc.Root
end;

end.

Upvotes: 6

menjaraz
menjaraz

Reputation: 7575

Give a try to himXML by himitsu.

It is released under MPL v1.1 , GPL v3.0 or LGPL v3.0 license.

You will have to register to the Delphi-Praxis (german) excellent Delphi site so as to be able to download:

It has a very impressive performance and the distribution includes demos demonstrating that. I've successfully used it in Delphi 2007, Delphi 2010 and Delphi XE.

Upvotes: 3

Related Questions